The present invention relates to a gene useful in a process to increase the microbial
production of carotenoids.

The carotenoid astaxanthin is distributed in a wide variety of organisms such as animals,
algae and microorganisms. It has a strong antioxidation property against reactive oxygen
species. Astaxanthin is used as a coloring reagent, especially in the industry of farmed fish,
such as salmon, because astaxanthin imparts distinctive orange-red coloration to the
animals and contributes to consumer appeal in the marketplace.

One of the first steps in the carotenogenic pathway of, e.g. Phaffia rhodozyma, is the
condensation of two molecules of acetyl-CoA. Acetyl-CoA is also the substrate for acetyl-CoA
carboxylase, one of the enzymes involved in fatty acid biosynthesis.

In one aspect, the present invention provides a novel DNA fragment comprising a gene
encoding the enzyme acetyl-CoA carboxylase.

More particularly, the present invention provides a DNA containing regulatory regions,
such as promoter and terminator, as well as the open reading frame of acetyl-CoA
carboxylase gene.

The present invention provides a DNA fragment encoding acetyl-CoA carboxylase in P.
rhodozyma. The said DNA means a cDNA which contains only open reading frame
flanked between the short fragments in its 5'- and 3'-untranslated region, and a genomic
DNA which also contains its regulatory sequences such as its promoter and terminator
which are necessary for the expression of the acetyl-CoA carboxylase gene in P. rhodozyma.

HEI/sk 27.09.2002

Accordingly, the present invention relates to a polynucleotide comprising a nucleic acid
molecule selected from the group consisting of:

(a) nucleic acid molecules encoding at least the mature form of the polypeptide depicted in
SEQ ID NO:3;

(b) nucleic acid molecules comprising the coding sequence as depicted in SEQ ID NO:2;

(c) nucleic acid molecules whose nucleotide sequence is degenerate as a result of the
genetic code to a nucleotide sequence of (a) or (b);

(d) nucleic acid molecules encoding a polypeptide derived from the polypeptide encoded
by a polynucleotide of (a) to (c) by way of substitution, deletion and/or addition of one or
several amino acids of the amino acid sequence of the polypeptides encoded by a
polynucleotide of (a) to (c);

(e) nucleic acid molecules encoding a polypeptide derived from the polypeptide whose
sequence has an identity of 56.3 % or more to the amino acid sequence of the polypeptide
encoded by a nucleic acid molecule of (a) or (b);

(f) nucleic acid molecules comprising a fragment or an epitope-bearing portion of a
polypeptide encoded by a nucleic acid molecule of any one of (a) to (e) and having acetyl-CoA
carboxylase activity;

(g) nucleic acid molecules comprising a polynucleotide having a sequence of a nucleic acid
molecule amplified from Phaffia or Xanthophyllomyces nucleic acid library using the
primers depicted in SEQ ID NO:4, 5, and 6;

(h) nucleic acid molecules encoding a polypeptide having acetyl-CoA carboxylase activity,
wherein said polypeptide is a fragment of a polypeptide encoded by any one of (a) to (g);

(i) nucleic acid molecules comprising at least 15 nucleotides of a polynucleotide of any one
of (a) to (d);

(j) nucleic acid molecules encoding a polypeptide having acetyl-CoA carboxylase activity,
wherein said polypeptide is recognized by antibodies that have been raised against a
polypeptide encoded by a nucleic acid molecule of any one of (a) to (h);

(k) nucleic acid molecules obtainable by screening an appropriate library under stringent
conditions with a probe having the sequence of the nucleic acid molecule of any one of (a)
to (j), and encoding a polypeptide having an acetyl-CoA carboxylase activity;

(l) nucleic acid molecules whose complementary strand hybridizes under stringent
conditions with a nucleic acid molecule of any one of (a) to (k), and encoding a polypeptide
having acetyl-CoA carboxylase activity.

The terms "gene(s)", "polynucleotide", "nucleic acid sequence", "nucleotide sequence",
"DNA sequence" or "nucleic acid molecule(s)" as used herein refer to a polymeric form of
nucleotides of any length, either ribonucleotides or deoxyribonucleotides. This term refers
only to the primary structure of the molecule.

Thus, this term includes double- and single-stranded DNA, and RNA. It also includes
known types of modifications, for example, methylation, "caps", substitution of one or
more of the naturally occurring nucleotides with an analog. Preferably, the DNA sequence
of the invention comprises a coding sequence encoding the above-defined polypeptide.

A "coding sequence" is a nucleotide sequence which is transcribed into mRNA and/or
translated into a polypeptide when placed under the control of appropriate regulatory
sequences. The boundaries of the coding sequence are determined by a translation start
codon at the 5'-terminus and a translation stop codon at the 3'-terminus. A coding
sequence can include, but is not limited to mRNA, cDNA, recombinant nucleotide
sequences or genomic DNA, while introns may be present as well under certain
circumstances. SEQ ID NO:1 depicts the genomic DNA in which the intron sequence is
inserted in the coding sequence for acetyl-CoA carboxylase gene from P. rhodozyma.
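
The way a start codon and a stop codon delimit a coding sequence may be illustrated by the
following minimal sketch. It assumes an intron-free DNA sense strand and the standard
genetic code; the function name and the toy sequence are hypothetical and are not taken
from SEQ ID NO:1 or SEQ ID NO:2.

```python
# Stop codons of the standard genetic code (DNA sense-strand notation).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_sequence(dna: str) -> str:
    """Return the first ATG-to-stop open reading frame in dna, or "" if none.

    Assumes an intron-free sense strand; the stop codon is included.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Walk codon by codon in the frame set by this start codon.
        for pos in range(start + 3, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                return dna[start:pos + 3]
        # No in-frame stop codon found; try the next ATG.
        start = dna.find("ATG", start + 1)
    return ""


# Hypothetical toy sequence: short 5' and 3' flanks around a 4-codon frame.
toy_sequence = "CCATGGCTAAATGACC"
print(find_coding_sequence(toy_sequence))
```

Only stop codons lying in the reading frame set by the start codon terminate the frame;
the out-of-frame TAA inside the toy sequence is ignored.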

In general, the gene consists of several parts which have different functions from each
other. In eukaryotes, genes which encode a corresponding protein are transcribed to
premature messenger RNA (pre-mRNA), differing from the genes for ribosomal RNA (rRNA),
small nuclear RNA (snRNA) and transfer RNA (tRNA). Although RNA polymerase II
(PolII) plays a central role in this transcription event, PolII cannot solely start
transcription without a cis element covering an upstream region containing a promoter and an
upstream activation sequence (UAS), and a trans-acting protein factor. At first, a
transcription initiation complex which consists of several basic protein components recognizes
the promoter sequence in the 5'-adjacent region of the gene to be expressed. In this event,
some additional participants are required in the case of a gene which is expressed under
some specific regulation, such as a heat shock response, or adaptation to a nutrition
starvation, and so on. In such a case, a UAS is required to exist in the 5'-untranslated
upstream region around the promoter sequence, and some positive or negative regulator
proteins recognize and bind to the UAS. The strength of the binding of the transcription
initiation complex to the promoter sequence is affected by such a binding of the
trans-acting factor around the promoter, and this enables the regulation of transcription activity.

After the activation of a transcription initiation complex by phosphorylation, the
transcription initiation complex initiates transcription from the transcription start site. Some
parts of the transcription initiation complex are detached as an elongation complex from
the promoter region to the 3' direction of the gene (this step is called a promoter
clearance event) and the elongation complex continues the transcription until it reaches
a termination sequence that is located in the 3'-adjacent downstream region of the gene.
Pre-mRNA thus generated is modified in the nucleus by the addition of a cap structure at the
cap site, which almost corresponds to the transcription start site, and by the addition of
polyA stretches at the polyA signal, which is located at the 3'-adjacent downstream region.
Next, intron structures are removed from the coding region and exon parts are combined
to yield an open reading frame whose sequence corresponds to the primary amino acid
sequence of a corresponding protein. This modification in which a mature mRNA is
generated is necessary for a stable gene expression. cDNA in general terms corresponds to
the DNA sequence which is reverse-transcribed from this mature mRNA sequence. It can
be synthesized experimentally by the reverse transcriptase derived from viral species by
using a mature mRNA as a template.
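
The removal of introns and joining of exons described above may be sketched as follows.
This is a simplified illustration only: the intron coordinates are assumed to be known in
advance, the toy pre-mRNA is hypothetical, and the cDNA is represented by the sense-strand
DNA copy of the mature mRNA rather than by the actual product of reverse transcriptase.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Join the exons of pre_mrna by cutting out the given introns.

    introns holds 0-based, end-exclusive (start, end) coordinates,
    assumed to be non-overlapping.
    """
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[cursor:start])  # exon preceding this intron
        cursor = end                          # skip over the intron
    exons.append(pre_mrna[cursor:])           # last exon
    return "".join(exons)


# Hypothetical pre-mRNA: exon 1 + intron (GU...AG) + exon 2.
toy_pre_mrna = "AUGGCU" + "GUAAGUUUUCAG" + "AAAUGA"
mature_mrna = splice(toy_pre_mrna, [(6, 18)])

# Sense-strand DNA equivalent of the mature mRNA (U replaced by T).
cdna_sense = mature_mrna.replace("U", "T")
print(mature_mrna, cdna_sense)
```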

To express a gene which was derived from a eukaryote, a procedure in which cDNA is
cloned into an expression vector for E. coli is often used. This results from the fact that
the specificity of intron structure varies among organisms, and from an inability to
recognize the intron sequence from other species. In fact, a prokaryote has no intron structure
in its own genetic background. Even in yeast, the genetic background is different between
Ascomycetes, to which Saccharomyces cerevisiae belongs, and Basidiomycetes, to which
P. rhodozyma belongs; e.g. the intron structure of the actin gene from P. rhodozyma cannot
be recognized nor spliced by the ascomycetous yeast, S. cerevisiae.

Intron structures of some kinds of genes appear to be involved in the regulation of the
expression of their genes. It might be important to use a genomic fragment which has its
introns in a case of self-cloning of a gene of interest whose intron structure is involved in
such a regulation of its own gene expression.

To apply a genetic engineering method for a strain improvement study, it is necessary to
study its genetic mechanism in events such as transcription and translation. It is
important to determine a genetic sequence such as its UAS, promoter, intron structure and
terminator to study the genetic mechanism.

According to this invention, the acetyl-CoA carboxylase (ACC) gene from P. rhodozyma,
including its 5'- and 3'-adjacent regions as well as its intron structure, was determined.

The invention further encompasses polynucleotides that differ from one of the nucleotide
sequences shown in SEQ ID NO:2 (and portions thereof) due to degeneracy of the genetic
code and also encode an acetyl-CoA carboxylase as that encoded by the nucleotide
sequences shown in SEQ ID NO:2. Further the polynucleotide of the invention has a nucleotide
sequence encoding a protein having an amino acid sequence shown in SEQ ID NO:3. In a
still further embodiment, the polynucleotide of the invention encodes a full length P.
rhodozyma protein which is substantially homologous to an amino acid sequence of SEQ
ID NO:3.
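
Degeneracy of the genetic code may be illustrated by the following minimal sketch, in
which two different nucleotide sequences encode the same polypeptide. The codon table is
the standard genetic code written in the conventional T, C, A, G order; the two toy
sequences and the function name are hypothetical and are unrelated to SEQ ID NO:2.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence up to (not including) the first stop."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# Two hypothetical sequences differing at three positions (GCT/GCC, AAA/AAG, TGA/TAA).
variant_1 = "ATGGCTAAATGA"
variant_2 = "ATGGCCAAGTAA"
print(translate(variant_1), translate(variant_2))
```

Both toy sequences translate to the same three-residue peptide although their nucleotide
sequences differ.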

In addition, it will be appreciated by those skilled in the art that DNA sequence
polymorphisms that lead to changes in the amino acid sequences may exist within a population
(e.g., the P. rhodozyma population). Such genetic polymorphism in the acetyl-CoA
carboxylase gene may exist among individuals within a population due to natural variation.

As used herein, the terms "gene" and "recombinant gene" refer to nucleic acid molecules
comprising an open reading frame encoding an acetyl-CoA carboxylase, preferably an
acetyl-CoA carboxylase from P. rhodozyma.

Such natural variations can typically result in 1-5 % variance in the nucleotide sequence of
the acetyl-CoA carboxylase gene. Any and all such nucleotide variations and resulting
amino acid polymorphism in acetyl-CoA carboxylase that are the result of natural
variation and that do not alter the functional activity of acetyl-CoA carboxylase are intended to
be within the scope of the invention.

Polynucleotides corresponding to natural variants and non-P. rhodozyma homologues of
the acetyl-CoA carboxylase cDNA of the invention can be isolated based on their
homology to P. rhodozyma acetyl-CoA carboxylase polynucleotides disclosed herein using the
polynucleotide of the invention, or a portion thereof, as a hybridization probe according
to standard hybridization techniques under stringent hybridization conditions.
Accordingly, in another embodiment, a polynucleotide of the invention is at least 15
nucleotides in length. Preferably it hybridizes under stringent conditions to the nucleic
acid molecule comprising a nucleotide sequence of the polynucleotide of the present
invention, e.g. SEQ ID NO:2. In other embodiments, the nucleic acid is at least 20, 30, 50,
100, 250 or more nucleotides in length. The term "hybridizes under stringent conditions"
is defined herein and is intended to describe conditions for hybridization and washing
under which nucleotide sequences at least 60% identical to each other typically remain
hybridized to each other. Preferably, the conditions are such that sequences at least about
65% or 70%, more preferably at least about 75% or 80%, and even more preferably at least
about 85%, 90% or 95% or more identical to each other typically remain hybridized to
each other. Preferably, a polynucleotide of the invention that hybridizes under stringent
conditions to a sequence of SEQ ID NO:2 corresponds to a naturally occurring nucleic
acid molecule.

In the present invention, the polynucleotide sequence includes SEQ ID NO:2 and
fragments thereof having polynucleotide sequences which hybridize to SEQ ID NO:2 under
stringent conditions which are sufficient to identify specific binding to SEQ ID NO:2. For
example, any combination of the following hybridization and wash conditions may be
used to achieve the required specific binding:

High Stringent Hybridization: 6X SSC, 0.5% SDS, 100 µg/ml denatured salmon sperm
DNA, 50% formamide, incubate overnight with gentle rocking at 42°C.
High Stringent Wash: 1 wash in 2X SSC, 0.5% SDS at room temperature for 15 minutes,
followed by another wash in 0.1X SSC, 0.5% SDS at room temperature for 15 minutes.
Low Stringent Hybridization: 6X SSC, 0.5% SDS, 100 µg/ml denatured salmon sperm
DNA, 50% formamide, incubate overnight with gentle rocking at 37°C.
Low Stringent Wash: 1 wash in 0.1X SSC, 0.5% SDS at room temperature for 15 minutes.

Moderately stringent conditions may be obtained by varying the temperature at which the
hybridization reaction occurs and/or the wash conditions as set forth above. In the
present invention, it is preferred to use high stringent hybridization and wash conditions
to define the antisense activity against acetyl-CoA carboxylase gene from P. rhodozyma.

The term "homology" means that the respective nucleic acid molecules or encoded
proteins are functionally and/or structurally equivalent. The nucleic acid molecules that are
homologous to the nucleic acid molecules described above and that are derivatives of said
nucleic acid molecules are, for example, variations of said nucleic acid molecules which
represent modifications having the same biological function, in particular encoding proteins
with the same or substantially the same biological function. They may be naturally
occurring variations, such as sequences from other plant varieties or species, or mutations. These
mutations may occur naturally or may be obtained by mutagenesis techniques. The allelic
variations may be naturally occurring allelic variants as well as synthetically produced or
genetically engineered variants. Structural equivalents can, for example, be identified by
testing the binding of said polypeptide to antibodies. Structural equivalents have similar
immunological characteristics, e.g. comprise similar epitopes.

As used herein, a "naturally-occurring" nucleic acid molecule refers to an RNA or DNA
molecule having a nucleotide sequence that occurs in nature (e.g. encodes a natural
protein). Preferably, the polynucleotide encodes a natural P. rhodozyma acetyl-CoA carboxylase.

In addition to naturally-occurring variants of the acetyl-CoA carboxylase sequence that
may exist in the population, the skilled artisan will further appreciate that changes can be
introduced by mutation into a nucleotide sequence of the polynucleotide encoding
acetyl-CoA carboxylase, thereby leading to changes in the amino acid sequence of the encoded
acetyl-CoA carboxylase, without altering the functional ability of the acetyl-CoA
carboxylase. For example, nucleotide substitutions leading to amino acid substitutions at
"non-essential" amino acid residues can be made in a sequence of the polynucleotide encoding
acetyl-CoA carboxylase, e.g. SEQ ID NO:2. A "non-essential" amino acid residue is a
residue that can be altered from the wild-type sequence of one of the acetyl-CoA carboxylases
without altering the activity of said acetyl-CoA carboxylase, whereas an "essential" amino
acid residue is required for acetyl-CoA carboxylase activity. Other amino acid residues,
however, (e.g., those that are not conserved or only semi-conserved in the domain having
acetyl-CoA carboxylase activity) may not be essential for activity and thus are likely to be
amenable to alteration without altering acetyl-CoA carboxylase activity.

Accordingly, the invention relates to polynucleotides encoding acetyl-CoA carboxylase
that contain changes in amino acid residues that are not essential for acetyl-CoA
carboxylase activity. Such acetyl-CoA carboxylase differs in amino acid sequence from a
sequence contained in SEQ ID NO:3 yet retains the acetyl-CoA carboxylase activity
described herein. The polynucleotide can comprise a nucleotide sequence encoding a
polypeptide, wherein the polypeptide comprises an amino acid sequence at least about
60% identical to an amino acid sequence of SEQ ID NO:3 and has acetyl-CoA carboxylase
activity. Preferably, the protein encoded by the nucleic acid molecule is at least about
60-65% identical to the sequence in SEQ ID NO:3, more preferably at least about 60-70%
identical to one of the sequences in SEQ ID NO:3, even more preferably at least about
70-80%, 80-90%, 90-95% homologous to the sequence in SEQ ID NO:3, and most preferably
at least about 96%, 97%, 98%, or 99% identical to the sequence in SEQ ID NO:3.

To determine the percent homology of two amino acid sequences (e.g., one of the
sequences of SEQ ID NO:3 and a mutant form thereof) or of two nucleic acids, the sequences
are aligned for optimal comparison purposes (e.g., gaps can be introduced in the sequence
of one protein or nucleic acid for optimal alignment with the other protein or nucleic
acid). The amino acid residues or nucleotides at corresponding amino acid positions or
nucleotide positions are then compared. When a position in one sequence (e.g., one of the
sequences of SEQ ID NO:2 or 3) is occupied by the same amino acid residue or nucleotide
as the corresponding position in the other sequence (e.g., a mutant form of the sequence
selected), then the molecules are homologous at that position (i.e., as used herein amino
acid or nucleic acid "homology" is equivalent to amino acid or nucleic acid "identity").
The percent homology between the two sequences is a function of the number of identical
positions shared by the sequences (i.e., % homology = number of identical positions / total
number of positions x 100). The homology can be determined by computer programs such as
Blast 2.0 [Altschul, Nucleic Acids Res., 25:3389-3402 (1997)]. In this invention, GENETYX-
SV/RC software (Software Development Co., Ltd., Tokyo, Japan) is used with its default
algorithm as such homology analysis software. This software uses the Lipman-Pearson
method for its analytic algorithm.
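
The percent homology formula given above may be illustrated by the following minimal
sketch. It assumes that the two sequences have already been aligned, that gaps are written
as "-", and that the total number of positions is the length of the alignment including
gap columns; these are assumptions of the sketch and do not reproduce the algorithm of the
GENETYX-SV/RC software. The toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """% homology = number of identical positions / total positions x 100.

    Both inputs must already be aligned (equal length, gaps as "-").
    A gap never counts as an identical position.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    identical = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return identical / len(aligned_a) * 100


# Hypothetical 8-column alignment: one substitution and one gap.
print(percent_identity("MKTAYIAK", "MKSAY-AK"))  # 6 identical of 8 positions
```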



A nucleic acid molecule encoding an acetyl-CoA carboxylase homologous to a protein
with an amino acid sequence of SEQ ID NO:3 can be created by introducing one or more
nucleotide substitutions, additions or deletions into a nucleotide sequence of the
polynucleotide of the present invention, in particular of SEQ ID NO:2, such that one or
more amino acid substitutions, additions or deletions are introduced into the encoded
protein. Mutations can be introduced into the sequences of, e.g., SEQ ID NO:2 by
standard techniques, such as site-directed mutagenesis and PCR-mediated mutagenesis.
Preferably, conservative amino acid substitutions are made at one or more predicted
non-essential amino acid residues. A "conservative amino acid substitution" is one in which the
amino acid residue is replaced with an amino acid residue having a similar side chain.
Families of amino acid residues having similar side chains have been defined in the art.
These families include amino acids with basic side chains (e.g., lysine, arginine, histidine),
acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g.,
glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains
(e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan),
beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains
(e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted nonessential
amino acid residue in an acetyl-CoA carboxylase is preferably replaced with another amino
acid residue from the same family. Alternatively, in another embodiment, mutations can
be introduced randomly along all or part of an acetyl-CoA carboxylase coding sequence,
such as by saturation mutagenesis, and the resultant mutants can be screened for an
acetyl-CoA carboxylase activity described herein to identify mutants that retain acetyl-CoA
carboxylase activity. Following mutagenesis of one of the sequences of SEQ ID NO:2, the
encoded protein can be expressed recombinantly and the activity of the protein can be
determined using, for example, assays described herein.
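
The side-chain families listed above may be used to check whether a given substitution is
conservative, as in the following minimal sketch. The families are transcribed from the
preceding paragraph using one-letter amino acid codes; the function name is hypothetical,
and treating membership in any one shared family as sufficient is an assumption of the
sketch.

```python
# Side-chain families from the text, in one-letter amino acid codes.
# Some residues belong to more than one family (e.g. T, V, I, H, Y, F, W).
FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues share at least one side-chain family."""
    return any(
        original in members and replacement in members
        for members in FAMILIES.values()
    )


print(is_conservative("K", "R"))  # lysine -> arginine, both basic
print(is_conservative("D", "K"))  # aspartic acid -> lysine, no shared family
```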

A polynucleotide of the present invention, e.g., a nucleic acid molecule having a nucleotide
sequence of SEQ ID NO:2, or a portion thereof can be isolated using standard molecular
biology techniques and the sequence information provided herein. For example,
acetyl-CoA carboxylase cDNA can be isolated from a library using all or a portion of one of the
sequences of the polynucleotide of the present invention as a hybridization probe and
standard hybridization techniques. Moreover, a polynucleotide encompassing all or a
portion of one of the sequences of the polynucleotide of the present invention can be isolated
by the polymerase chain reaction using oligonucleotide primers designed based upon this
sequence (e.g., a nucleic acid molecule encompassing all or a portion of one of the
sequences of the polynucleotide of the present invention can be isolated by the polymerase chain
reaction using oligonucleotide primers, e.g. of SEQ ID NO:4, 5, or 6, designed based upon
this same sequence of the polynucleotide of the present invention). For example, mRNA can
be isolated from cells, e.g. Phaffia (e.g., by the guanidinium-thiocyanate extraction
procedure of Chirgwin et al.) and cDNA can be prepared using reverse transcriptase (e.g.,
Moloney MLV reverse transcriptase or AMV reverse transcriptase available from Promega
(Madison, USA)). Synthetic oligonucleotide primers for polymerase chain reaction
amplification can be designed based upon one of the nucleotide sequences shown in SEQ
ID NO:2. A polynucleotide of the invention can be amplified using cDNA or,
alternatively, genomic DNA, as a template and appropriate oligonucleotide primers
according to standard PCR amplification techniques. The polynucleotide so amplified can
be cloned into an appropriate vector and characterized by DNA sequence analysis.
Furthermore, oligonucleotides corresponding to an acetyl-CoA carboxylase nucleotide
sequence can be prepared by standard synthetic techniques, e.g., using an automated DNA
synthesizer.

The terms "fragment", "fragment of a sequence" or "part of a sequence" mean a truncated
sequence of the original sequence referred to. The truncated sequence (nucleic acid or
protein sequence) can vary widely in length; the minimum size being a sequence of
sufficient size to provide a sequence with at least a comparable function and/or activity of the
original sequence referred to, while the maximum size is not critical. In some applications,
the maximum size usually is not substantially greater than that required to provide the
desired activity and/or function(s) of the original sequence.

Typically, the truncated amino acid sequence will range from about 5 to about 60 amino
acids in length. More typically, however, the sequence will be a maximum of about 50
amino acids in length, preferably a maximum of about 30 amino acids. It is usually
desirable to select sequences of at least about 10, 12 or 15 amino acids, up to a maximum of
about 20 or 25 amino acids.

The term "epitope" relates to specific immunoreactive sites within an antigen, also known
as antigenic determinants. These epitopes can be a linear array of monomers in a
polymeric composition - such as amino acids in a protein - or consist of or comprise a more
complex secondary or tertiary structure. Those of skill will recognize that all immunogens
(i.e., substances capable of eliciting an immune response) are antigens; however, some
antigens, such as haptens, are not immunogens but may be made immunogenic by
coupling to a carrier molecule. The term "antigen" includes references to a substance to
which an antibody can be generated and/or to which the antibody is specifically
immunoreactive.

The term "one or several amino acids" relates to at least one amino acid but not more than
that number of amino acids which would result in a homology of below 60% identity.
Preferably, the identity is more than 70% or 80%, more preferred are 85%, 90% or 95%,
even more preferred are 96%, 97%, 98%, or 99% identity.
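
The upper bound implied by "one or several amino acids" may be worked out as in the
following minimal sketch. It assumes that only substitutions are made, so that the
sequence length is unchanged and every altered residue lowers the number of identical
positions by one; deletions and additions are not modelled. The function name is
hypothetical.

```python
def max_changes(length: int, min_identity_percent: int = 60) -> int:
    """Largest number of substituted residues keeping identity >= threshold.

    identity = (length - changes) / length x 100, hence
    changes <= length x (100 - min_identity_percent) / 100.
    Integer arithmetic avoids floating-point rounding.
    """
    return length * (100 - min_identity_percent) // 100


# For a hypothetical polypeptide of 100 residues and the 60% threshold:
print(max_changes(100))       # up to 40 residues may differ
print(max_changes(100, 95))   # up to 5 residues may differ at 95% identity
```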

The term "acetyl-CoA carboxylase" or "acetyl-CoA carboxylase activity" relates to
enzymatic activities of a polypeptide as described below or which can be determined in an enzyme
assay method. Furthermore, polypeptides that are inactive in an assay herein but are
recognized by an antibody specifically binding to acetyl-CoA carboxylase, i.e., having one or
more acetyl-CoA carboxylase epitopes, are also comprised under the term "acetyl-CoA
carboxylase". In these cases activity refers to their immunological activity.

The terms "polynucleotide" and "nucleic acid molecule" also relate to "isolated"
polynucleotides or nucleic acid molecules. An "isolated" nucleic acid molecule is one which is
separated from other nucleic acid molecules which are present in the natural source of the
nucleic acid. Preferably, an "isolated" nucleic acid is free of sequences which naturally
flank the nucleic acid (i.e., sequences located at the 5' and 3' ends of the nucleic acid) in the
genomic DNA of the organism from which the nucleic acid is derived.

For example, in various embodiments, the polynucleotide can contain less than
about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of nucleotide sequences which naturally
flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is
derived (e.g., a Phaffia cell). Moreover, the polynucleotides of the present invention, in
particular an "isolated" nucleic acid molecule, such as a cDNA molecule, can be
substantially free of other cellular material, or culture medium when produced by recombinant
techniques, or chemical precursors or other chemicals when chemically synthesized.

Preferably, the polynucleotide of the invention comprises one of the nucleotide sequences
shown in SEQ ID NO:2. The sequence of SEQ ID NO:2 corresponds to the P. rhodozyma
acetyl-CoA carboxylase cDNAs of the invention.

Further, the polynucleotide of the invention comprises a nucleic acid molecule which is a
complement of one of the nucleotide sequences of above mentioned polynucleotides or a
portion thereof. A nucleic acid molecule which is complementary to one of the nucleotide
sequences shown in SEQ ID NO:2 is one which is sufficiently complementary to one of the
nucleotide sequences shown in SEQ ID NO:2 such that it can hybridize to one of the
nucleotide sequences shown in SEQ ID NO:2, thereby forming a stable duplex.
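
The notion of a complementary strand may be illustrated by the following minimal sketch,
which computes the reverse complement of a DNA sequence. It covers only perfect
Watson-Crick complementarity; whether a merely "sufficiently complementary" strand forms
a stable duplex depends on the hybridization conditions and is not modelled here. The
function names and the toy sequence are hypothetical.

```python
# Watson-Crick base pairing for DNA (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def is_perfect_complement(strand_a: str, strand_b: str) -> bool:
    """True if strand_b is the exact reverse complement of strand_a."""
    return reverse_complement(strand_a).upper() == strand_b.upper()


# Hypothetical toy sense strand and its antisense partner.
sense = "ATGGCTAAATGA"
antisense = reverse_complement(sense)
print(antisense)
```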

The polynucleotide of the invention comprises a nucleotide sequence which is at least
about 60%, preferably at least about 65-70%, more preferably at least about 70-80%,
80-90%, or 90-95%, and even more preferably at least about 95%, 96%, 97%, 98%, 99% or
more homologous to a nucleotide sequence shown in SEQ ID NO:2, or a portion thereof.
The polynucleotide of the invention comprises a nucleotide sequence which hybridizes,
e.g., hybridizes under stringent conditions as defined herein, to one of the nucleotide
sequences shown in SEQ ID NO:2, or a portion thereof.

Moreover, the polynucleotide of the invention can comprise only*a portion of the coding 
region of one of the sequences in SEQ ID NO:2, for example a fragment which can be used 
as a probe or primer or a fragment encoding a biologically active portion of an acetyl-CoA 

20 carboxylase. The nucleotide sequences 1 determined from the cloning of the acetyl-CoA 
carboxylase gene from P. rhodozyma allows for the generation of probes and primers 
designed for use in identifying and/or £loni*ig acetyl-CoA carboxylase homologues in 
other cell types and organisms. Hie prbbe/primer typically comprises a substantially 
purified oligonucleotide. The oligonucleotide typically comprises; a region of nucleotide 

25 sequence that hybridizes under stringent conditions to at least about 12, 15 preferably 
about 20 or 25, more preferably about £0, 50 or 75 consecutive nucleotides of a sense 
strand of one of the sequences set forth, e.g.J in SEQ ID NO: No:2| an anti-sense sequence 
of one of the sequences, e.g., set forth in SEQ ID NO:2, or naturaljy occurring mutants 
thereof. Primers based on a nucleotide; of invention can be used in PCR reactions to clone 

30 acetyl-CoA carboxylase homologues. Probes based on the acetyl-toA carboxylase 

nucleotide sequences can be used to detect transcripts or genomidsequences encoding the 
same or homologous proteins. The probe can further comprise a jabel group attached 
thereto, e.g. : the label group can be a radioisotope, a fluorescent compound, an enzyme, or 
an enzyme co-factor. Such probes can be used as a part of a genoihic marker test kit for 

33 identifying cells which express an acetyl-CoA carboxylase, such as hy measuring a level of 
• ■ » * 

: t » 

£ \ 
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an acetyl-CoA carboxylase-encoding nucleic acid molecule in a sample c f cells e g 
detecting acetyl-CoA carboxylase mRNA levels or determining whether a genomic acetyl- 
CoA carboxylase gene has been mutated or. deleted 

The polynucleotide of the invention encodes a polypeptide or portion thereof which
includes an amino acid sequence which is sufficiently homologous to an amino acid
sequence of SEQ ID NO:3 such that the protein or portion thereof maintains an
acetyl-CoA carboxylase activity, in particular an acetyl-CoA carboxylase activity as
described in the examples in microorganisms or plants. As used herein, the language
"sufficiently homologous" refers to proteins or portions thereof which have amino acid
sequences which include a minimum number of identical or equivalent (e.g., an amino
acid residue which has a similar side chain as an amino acid residue in one of the
sequences of the polypeptide of the present invention) amino acid residues to an amino
acid sequence of SEQ ID NO:3 such that the protein or portion thereof has an
acetyl-CoA carboxylase activity. Examples of an acetyl-CoA carboxylase activity are also
described herein.

The protein is at least about 60-65%, preferably at least about 66-70%, and more
preferably at least about 70-80%, 80-90%, 90-95%, and most preferably at least about 96%,
97%, 98%, 99% or more homologous to an entire amino acid sequence of SEQ ID NO:3.
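
The percentage thresholds above can be illustrated with a small calculation. The following is a minimal sketch only: it assumes that the two sequences have already been aligned by some external tool (gaps written as "-"), and it uses simple percent identity over the full length of the reference as a stand-in for "homologous". The function name and the example peptides are hypothetical and are not taken from SEQ ID NO:3.

```python
def percent_identity(aligned_ref: str, aligned_query: str) -> float:
    """Percent of reference residues matched by an identical query residue.

    Both inputs must be rows of the same pre-computed pairwise alignment,
    with '-' marking gaps. Insertions in the query are not penalised here,
    which is a simplifying assumption of this sketch.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    ref_length = sum(1 for r in aligned_ref if r != "-")
    if ref_length == 0:
        raise ValueError("reference contains no residues")
    matches = sum(
        1 for r, q in zip(aligned_ref, aligned_query) if r != "-" and r == q
    )
    return 100.0 * matches / ref_length


# Hypothetical 10-residue peptides differing at one position.
identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
print(f"{identity:.1f}% identical")
```

A value computed this way could then be compared against whichever threshold (for example 60%, 70% or 96%) is of interest.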

Portions of proteins encoded by the acetyl-CoA carboxylase polynucleotide of the
invention are preferably biologically active portions of the acetyl-CoA carboxylase.

As mentioned herein, the term "biologically active portion of acetyl-CoA carboxylase" is
intended to include a portion, e.g., a domain/motif, that has acetyl-CoA carboxylase
activity or has an immunological activity such that it binds to an antibody binding
specifically to acetyl-CoA carboxylase. To determine whether an acetyl-CoA carboxylase
or a biologically active portion thereof can participate in the metabolism, an assay of
enzymatic activity may be performed. Such assay methods are well known to those skilled
in the art, as detailed in the Examples. Additional nucleic acid fragments encoding
biologically active portions of an acetyl-CoA carboxylase can be prepared by isolating a
portion of one of the sequences in SEQ ID NO:2, expressing the encoded portion of the
acetyl-CoA carboxylase or peptide (e.g., by recombinant expression in vitro) and assessing
the activity of the encoded portion of the acetyl-CoA carboxylase or peptide.



At first, a partial gene fragment containing a portion of the ACC gene was cloned by using
the degenerate PCR method. Degenerate PCR is a method to clone a gene of interest
which has high amino acid sequence homology to a known enzyme from other species
which has the same or similar function. A degenerate primer, which is used as a primer in
degenerate PCR, is designed by reverse translation of the amino acid sequence to the
corresponding nucleotides ("degenerated"). For such a degenerate primer, a mixed primer
which consists of any of A, C, G or T, or a primer containing inosine at an ambiguous
position, is generally used. In this invention, such mixed primers were used as degenerate
primers to clone the above gene.
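
The reverse translation step described above can be sketched in a few lines. This is an illustrative sketch, not the primer design actually used in this work: it assumes the standard genetic code and IUPAC ambiguity symbols, and the peptide "MWDK" is a hypothetical example rather than a motif from acetyl-CoA carboxylase. Collapsing each codon position independently slightly over-covers six-fold degenerate residues (Leu, Ser, Arg), which is noted in the code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (NCBI table 1 layout).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC ambiguity codes keyed by the set of bases they represent.
IUPAC = {
    frozenset(bases): code
    for bases, code in {
        "A": "A", "C": "C", "G": "G", "T": "T",
        "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
        "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
    }.items()
}


def degenerate_primer(peptide: str) -> tuple[str, int]:
    """Reverse-translate a peptide into a mixed (degenerate) primer.

    Returns the primer in IUPAC notation and the number of distinct
    sequences in the mixture. Each codon position is collapsed
    independently, so Leu, Ser and Arg are covered by a slightly larger
    mixture than strictly necessary.
    """
    primer = []
    degeneracy = 1
    for residue in peptide:
        codons = [c for c, aa in CODON_TABLE.items() if aa == residue]
        if not codons:
            raise ValueError(f"unknown amino acid: {residue!r}")
        for position in range(3):
            bases = frozenset(codon[position] for codon in codons)
            primer.append(IUPAC[bases])
            degeneracy *= len(bases)
    return "".join(primer), degeneracy


primer, mixture_size = degenerate_primer("MWDK")  # hypothetical peptide
print(primer, mixture_size)
```

Peptide stretches rich in Met and Trp are usually favoured for such primers because they keep the mixture small; an inosine-containing primer would instead place inosine at the fully ambiguous (N) positions.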

An entire gene containing its coding region with its introns as well as its regulatory
regions such as a promoter or a terminator can be cloned from a chromosome by screening
a genomic library, which is constructed in a phage vector or plasmid vector in an
appropriate host, by using a partial DNA fragment obtained by degenerate PCR as
described above as a probe after it has been labeled. Generally, E. coli as a host strain and
an E. coli vector, a phage vector such as a λ phage vector, or a plasmid vector such as a
pUC vector, is often used in the construction of a library and the subsequent genetic
manipulation such as sequencing, restriction digestion, ligation and the like. In this
invention, an EcoRI genomic library of P. rhodozyma was constructed in a derivative of
the λ vector, λZAPII. The insert size, i.e., the length of insert that must be cloned, was
determined by Southern blot hybridization for the gene before construction of the library.
In this invention, the DNA used as a probe was labeled with digoxigenin (DIG), a steroid
hapten, instead of the conventional 32P label, following the protocol prepared by the
supplier (Boehringer Mannheim, Mannheim, Germany). A genomic library constructed
from the chromosome of P. rhodozyma was screened by using as a probe a DIG-labeled
DNA fragment which had a portion of the gene of interest. Hybridized plaques were
picked up and used for further study. When λZAPII (insert size below 9 kb) was used in
the construction of the genomic library, the in vivo excision protocol was conveniently
used for the succeeding step of cloning into the plasmid vector by using a derivative of
single-stranded M13 phage, ExAssist phage (Stratagene, La Jolla, USA). The plasmid DNA
thus obtained was examined by sequencing.

In this invention, we used the automated fluorescent DNA sequencer, ALFred system
(Pharmacia, Uppsala, Sweden), with an autocycle sequencing protocol in which Taq
DNA polymerase is employed in most cases of sequencing.

After the determination of the genomic sequence, the sequence of the coding region was
used for cloning the cDNA of the corresponding gene. The PCR method was also exploited
to clone the cDNA fragment. PCR primers whose sequences were identical to the
sequences at the 5'- and 3'-ends of the open reading frame (ORF) were synthesized with
the addition of an appropriate restriction site, and PCR was performed by using those PCR
primers. In this invention, a cDNA pool was used as a template in this PCR cloning of
cDNA. The said cDNA pool consists of various cDNA species which were synthesized in
vitro by the viral reverse transcriptase and Taq polymerase (CapFinder Kit manufactured
by Clontech, Palo Alto, U.S.A.) by using the mRNA obtained from P. rhodozyma as a
template. The cDNA of interest thus obtained was confirmed by sequencing. Furthermore,
the cDNA thus obtained was used for confirmation of its enzyme activity after cloning of
the cDNA fragment into an expression vector which functions in E. coli under strong
promoter activity such as the lac or T7 expression system.
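
The primer design described above (ORF ends plus an added restriction site) can be sketched as follows. This is a hedged illustration: the EcoRI (GAATTC) and BamHI (GGATCC) sites, the two-base 5' clamp, the primer length and the short ORF are all assumptions chosen for the example, since the text does not state which sites or lengths were used.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def orf_primers(
    orf: str,
    length: int = 20,
    forward_site: str = "GAATTC",  # EcoRI, assumed for illustration
    reverse_site: str = "GGATCC",  # BamHI, assumed for illustration
    clamp: str = "GC",             # short 5' extension, assumed
) -> tuple[str, str]:
    """Build forward and reverse primers for amplifying a full ORF.

    The forward primer matches the 5' end of the ORF; the reverse primer
    is the reverse complement of the 3' end. Each carries a restriction
    site at its 5' side for later cloning into an expression vector.
    """
    if len(orf) < length:
        raise ValueError("ORF is shorter than the requested primer length")
    forward = clamp + forward_site + orf[:length]
    reverse = clamp + reverse_site + reverse_complement(orf[-length:])
    return forward, reverse


# Hypothetical 27-nt ORF, not taken from SEQ ID NO:2.
toy_orf = "ATGGCTAAACCGGGTTCTGACTGGTAA"
forward_primer, reverse_primer = orf_primers(toy_orf, length=12)
print(forward_primer)
print(reverse_primer)
```

In practice the chosen sites should not occur inside the ORF itself, which could be checked with a simple substring test before ordering primers.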

In another embodiment, the present invention relates to a method for making a recombi- 
nant vector comprising inserting a polynucleotide of the invention into a vector. 

Further, the present invention relates to a recombinant vector containing the polynucleo- 
tide of the invention or produced by said method of the invention. 

As used herein, the term "vector" refers to a nucleic acid molecule capable of transporting
a polynucleotide to which it has been linked. One type of vector is a "plasmid", which
refers to a circular double stranded DNA loop into which additional DNA segments can be
ligated. Another type of vector is a viral vector, wherein additional DNA or PNA segments
can be ligated into the viral genome. Certain vectors are capable of autonomous
replication in a host cell into which they are introduced (e.g., bacterial vectors having a
bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g.,
non-episomal mammalian vectors) are integrated into the genome of a host cell upon
introduction into the host cell, and thereby are replicated along with the host genome.
Moreover, certain vectors are capable of directing the expression of genes to which they
are operatively linked. Such vectors are referred to herein as "expression vectors". In
general, expression vectors of utility in recombinant DNA techniques are often in the form
of plasmids. In the present specification, "plasmid" and "vector" can be used
interchangeably as the plasmid is the most commonly used form of vector. However, the
invention is intended to include such other forms of expression vectors, such as viral
vectors (e.g., replication defective retroviruses, adenoviruses and adeno-associated
viruses), which serve equivalent functions.

The present invention also relates to cosmids, viruses, bacteriophages and other vectors
used conventionally in genetic engineering that contain a nucleic acid molecule according
to the invention. Methods which are well known to those skilled in the art can be used to
construct various plasmids and vectors. Alternatively, the nucleic acid molecules and
vectors of the invention can be reconstituted into liposomes for delivery to target cells.

The present invention further relates to a vector in which the polynucleotide of the present
invention is operatively linked to expression control sequences allowing expression in
prokaryotic or eukaryotic host cells. The nature of such control sequences differs
depending upon the host organism. In prokaryotes, control sequences generally include
promoter, ribosomal binding site, and terminators. In eukaryotes, control sequences
generally include promoters, terminators and, in some instances, enhancers,
transactivators or transcription factors.

The term "control sequence" is intended to include, at a minimum, components the
presence of which are necessary for expression, and may also include additional
advantageous components.

The term "operably linked" refers to a juxtaposition wherein the components so described
are in a relationship permitting them to function in their intended manner. A control
sequence "operably linked" to a coding sequence is ligated in such a way that expression of
the coding sequence is achieved under conditions compatible with the control sequences.
In case the control sequence is a promoter, it is obvious for a skilled person that
double-stranded nucleic acid is used.

Regulatory sequences include those which direct constitutive expression of a nucleotide
sequence in many types of host cell and those which direct expression of the nucleotide
sequence only in certain host cells or under certain conditions. It will be appreciated by
those skilled in the art that the design of the expression vector can depend on such factors
as the choice of the host cell to be transformed, the level of expression of protein desired,
etc. The expression vectors of the invention can be introduced into host cells to thereby
produce proteins or peptides, including fusion proteins or peptides, encoded by
polynucleotides as described herein.

The recombinant expression vectors of the invention can be designed for expression of
acetyl-CoA carboxylase in prokaryotic or eukaryotic cells. For example, the polynucleotide
of the invention can be expressed in bacterial cells such as E. coli, insect cells (using
baculovirus expression vectors), yeast and other fungal cells, algae, ciliates of the types:
Holotrichia, Peritrichia, Spirotrichia, Suctoria, Tetrahymena, Paramecium, Colpidium,
Glaucoma, Platyophrya, Potomacus, Pseudocohnilembus, Euplotes, Engelmaniella, and
Stylonychia, especially Stylonychia lemnae, with vectors following a transformation method
as described in WO9801572, and multicellular plant cells. Alternatively, the recombinant
expression vector can be transcribed and translated in vitro, for example using T7
promoter regulatory sequences and T7 polymerase.

Expression of proteins in prokaryotes is most often carried out with vectors containing
constitutive or inducible promoters directing the expression of either fusion or non-fusion
proteins. Fusion vectors add a number of amino acids to a protein encoded therein,
usually to the amino terminus of the recombinant protein but also to the C-terminus or
fused within suitable regions in the protein. Such fusion vectors typically serve three
purposes: 1) to increase expression of recombinant protein; 2) to increase the solubility of
the recombinant protein; and 3) to aid in the purification of the recombinant protein by
acting as a ligand in affinity purification. Often, in fusion expression vectors, a proteolytic
cleavage site is introduced at the junction of the fusion moiety and the recombinant
protein to enable separation of the recombinant protein from the fusion moiety
subsequent to purification of the fusion protein. Such enzymes, and their cognate
recognition sequences, include Factor Xa, thrombin and enterokinase.

Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc.), pMAL (New
England Biolabs, Beverly, USA) and pRIT5 (Pharmacia, Piscataway, USA) which fuse
glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to
the target recombinant protein. In one embodiment, the coding sequence of the
polypeptide encoded by the polynucleotide of the present invention is cloned into a pGEX
expression vector to create a vector encoding a fusion protein comprising, from the
N-terminus to the C-terminus, GST-thrombin cleavage site-X protein. The fusion protein
can be purified by affinity chromatography using glutathione-agarose resin. Recombinant
acetyl-CoA carboxylase unfused to GST can then be recovered, e.g., by cleavage of the
fusion protein with thrombin.

Examples of suitable inducible non-fusion E. coli expression vectors include pTrc and pET
11d. Target gene expression from the pTrc vector relies on host RNA polymerase
transcription from a hybrid trp-lac fusion promoter. Target gene expression from the pET
11d vector relies on transcription from a T7 gn10-lac fusion promoter mediated by a
coexpressed viral RNA polymerase (T7 gn1). This viral polymerase is supplied by host
strains BL21(DE3) or HMS174(DE3) from a resident λ prophage harboring a T7 gn1 gene
under the transcriptional control of the lacUV 5 promoter.

One strategy to maximize recombinant protein expression is to express the protein in host
bacteria with an impaired capacity to proteolytically cleave the recombinant protein.
Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into
an expression vector so that the individual codons for each amino acid are those
preferentially utilised in the bacterium chosen for expression, such as E. coli. Such
alteration of nucleic acid sequences of the invention can be carried out by standard DNA
synthesis techniques.
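
The codon replacement strategy just described can be sketched as a simple recoding step. This is a minimal sketch under stated assumptions: the table of "preferred" codons below is a commonly quoted approximation for highly expressed E. coli genes and is included only for illustration; a real design would take codon frequencies from a usage table for the actual expression strain. The toy ORF is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Illustrative preferred codons for E. coli (an assumption of this sketch).
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TAA",
}


def recode_orf(orf: str) -> str:
    """Replace every codon by the preferred synonymous codon.

    The encoded amino acid sequence is unchanged; only the nucleotide
    sequence is altered.
    """
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    codons = (orf[i:i + 3] for i in range(0, len(orf), 3))
    return "".join(PREFERRED_CODON[CODON_TABLE[codon]] for codon in codons)


def translate(orf: str) -> str:
    """Translate an ORF with the standard genetic code."""
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


toy_orf = "ATGAAGTTATGA"  # hypothetical: Met-Lys-Leu-stop
recoded = recode_orf(toy_orf)
print(toy_orf, "->", recoded)
```

Because the recoded sequence differs from the native one at many positions, it would normally be produced by DNA synthesis, as the text notes.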

Further, the acetyl-CoA carboxylase vector can be a yeast expression vector. Examples of
vectors for expression in the yeast S. cerevisiae include pYepSec1, pMFa, pJRY88, and
pYES2 (Invitrogen, San Diego, USA). Vectors and methods for the construction of vectors
appropriate for use in other fungi, such as the filamentous fungi, are known to the skilled
artisan.

Alternatively, the polynucleotide of the invention can be introduced in insect cells using
baculovirus expression vectors. Baculovirus vectors available for expression of proteins in
cultured insect cells (e.g., Sf 9 cells) include the pAc series and the pVL series.

Alternatively, the polynucleotide of the invention is introduced in mammalian cells using a
mammalian expression vector. Examples of mammalian expression vectors include
pCDM8 and pMT2PC. When used in mammalian cells, the expression vector's control
functions are often provided by viral regulatory elements. For example, commonly used
promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus
40.

The recombinant mammalian expression vector can be capable of directing expression of
the nucleic acid preferentially in a particular cell type (e.g., tissue-specific regulatory
elements are used to express the nucleic acid). Tissue-specific regulatory elements are
known in the art. Non-limiting examples of suitable tissue-specific promoters include the
albumin promoter (liver-specific), lymphoid-specific promoters, in particular promoters
of T cell receptors and immunoglobulins, neuron-specific promoters (e.g., the
neurofilament promoter), pancreas-specific promoters, and mammary gland-specific
promoters (e.g., milk whey promoter; US 4,873,316 and EP 264,166). Developmentally
regulated promoters are also encompassed, for example the murine hox promoters and the
α-fetoprotein promoter.

The ACC gene thus expressed can be verified for its activity, e.g., by an enzyme assay
method. Some experimental protocols are described in the literature. The following is one
of the methods used for the determination of acetyl-CoA carboxylase activity. Assays are
performed by measuring the loss of acetyl-CoA and/or the production of malonyl-CoA
at 5 min intervals for 20 min, using reverse phase HPLC. The rate of conversion of
acetyl-CoA to malonyl-CoA is found to be linear for 20 min, and velocities are calculated
by linear regression analysis of the malonyl-CoA concentration with respect to time. The
reaction mixture contained 50 mM Tris, pH 7.5, 6 µM acetyl-CoA, 2 mM ATP, 7 mM
KHCO3, 8 mM MgCl2, 1 mM dithiothreitol, and 1 mg/ml bovine serum albumin. Enzyme
is preincubated (30 min, 25°C) with bovine serum albumin (2 mg/ml) and potassium
citrate (10 mM). Reactions are initiated by transferring 50 µl of preincubated enzyme to
the reaction mixture (final volume 200 µl) and incubated for 5-20 min at 25°C. Reactions
are terminated by addition of 50 µl of 10% perchloric acid. Following termination of the
reaction, the samples are centrifuged (3 min, 10,000 × g) and analyzed by HPLC. A mobile
phase of 10 mM KH2PO4, pH 6.7 (solvent A), and MeOH (solvent B) is used. The flow
rate is 1.0 ml/min, and the gradient is as follows: hold at 100% solvent A for 1 min,
followed by a linear gradient to 30% solvent B over the next 5 min, then hold at 30%
solvent B for 5 min. Using this method, the retention times were 7.5 and 9.0 min for
malonyl-CoA and acetyl-CoA, respectively. When an expression vector for S. cerevisiae is
used, a complementation analysis can be conveniently exploited by using a conditional
acetyl-CoA carboxylase null mutant strain derived from S. cerevisiae as a host strain for
confirmation of activity.
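
The velocity calculation mentioned in the assay (linear regression of malonyl-CoA concentration against time) can be sketched as below. This is an illustrative calculation only: the time points follow the 5 min sampling scheme of the text, but the concentration values are invented placeholders, and the function name is hypothetical.

```python
def linear_fit(times_min: list[float], conc_um: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit of concentration against time.

    Returns (slope, intercept); the slope is the reaction velocity in
    concentration units per minute.
    """
    if len(times_min) != len(conc_um) or len(times_min) < 2:
        raise ValueError("need at least two paired observations")
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_c = sum(conc_um) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (c - mean_c) for t, c in zip(times_min, conc_um))
    slope = sxy / sxx
    intercept = mean_c - slope * mean_t
    return slope, intercept


# Sampling every 5 min for 20 min; concentrations are made-up example data.
times = [0.0, 5.0, 10.0, 15.0, 20.0]
malonyl_coa_um = [0.0, 1.0, 2.0, 3.0, 4.0]
velocity, offset = linear_fit(times, malonyl_coa_um)
print(f"velocity = {velocity:.3f} uM/min")
```

With real data the residuals of the fit would also be worth inspecting, since the method relies on the conversion being linear over the whole 20 min.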

Following the confirmation of the enzyme activity, the expressed protein would be
purified and used for raising an antibody against the purified enzyme. The antibody thus
prepared would be used for characterization of the expression of the corresponding
enzyme in a strain improvement study, an optimization study of the culture conditions
and the like.

In a further embodiment, the present invention relates to an antibody that binds
specifically to the polypeptide of the present invention or parts, i.e., specific fragments or
epitopes of such a protein.



The antibodies of the invention can be used to identify and isolate other acetyl-CoA
carboxylases and genes. These antibodies can be monoclonal antibodies, polyclonal
antibodies or synthetic antibodies as well as fragments of antibodies, such as Fab, Fv or
scFv fragments etc. Monoclonal antibodies can be prepared, for example, by the
techniques as originally described by Köhler and Milstein, which comprise the fusion of
mouse myeloma cells to spleen cells derived from immunized mammals.

Furthermore, antibodies or fragments thereof to the aforementioned peptides can be
obtained by using methods known to the skilled person. These antibodies can be used, for
example, for the immunoprecipitation and immunolocalization of proteins according to
the invention as well as for the monitoring of the synthesis of such proteins, for example
in recombinant organisms, and for the identification of compounds interacting with the
protein according to the invention. For example, surface plasmon resonance as employed
in the BIAcore system can be used to increase the efficiency of phage antibody selections,
yielding a high increment of affinity from a single library of phage antibodies which bind
to an epitope of the protein of the invention. In many cases, the binding phenomenon of
antibodies to antigens is equivalent to other ligand/anti-ligand binding.

In this invention, the gene fragment for acetyl-CoA carboxylase was cloned from
P. rhodozyma with the purpose of decreasing its expression level in P. rhodozyma by a
genetic method using the cloned gene fragment.

To decrease gene expression with genetic methods, several strategies can be employed, one
of which is a gene-disruption method. In this method, a partial fragment of the objective
gene to be disrupted is ligated to a drug resistance cassette on an integration vector which
cannot replicate in the host organism. A drug resistance gene which encodes an enzyme
that enables the host to survive in the presence of a toxic antibiotic is often used as the
selectable marker. The G418 resistance gene harbored in pGB-Ph9 (Wery et al., Gene, 184,
89-97, 1997) is an example of a drug resistance gene which functions in P. rhodozyma.
A nutrition complementation marker can also be used in a host which has an appropriate
auxotrophy marker. The P. rhodozyma ATCC24221 strain, which requires cytidine for its
growth, is one example of such an auxotroph. By using CTP synthetase as donor DNA for
ATCC24221, a host vector system using nutrition complementation can be established.

After the transformation of the host organism and recombination between the objective
gene fragment on the vector and its corresponding gene fragment on the chromosome of
the host organism, the integration vector is integrated into the host chromosome by
single cross recombination. As a result of this recombination, the drug resistance cassette
is inserted in the objective gene, whose translated product is then synthesized only in a
truncated form which does not have its enzymatic function. In a similar manner, two
parts of the objective gene can also be used for gene disruption, in which case the drug
resistance gene is inserted between the two partial fragments of the objective gene on
the integration vector. With this type of vector, a double recombination event
between the gene fragments harbored on the integration vector and the corresponding
gene fragments on the chromosome of the host is expected. Although the frequency of this
double crossing-over recombination is lower than that of single cross recombination, the
null phenotype of the objective gene obtained by double cross recombination is more
stable than that obtained by single cross recombination.

On the other hand, this strategy is difficult to apply to a gene whose function is
essential and whose disruption is lethal for the host organism, such as the acetyl-CoA
carboxylase gene. The function of acetyl-CoA carboxylase is indispensable for the survival
of the host, beyond its role in the biosynthesis of fatty acids. From such a viewpoint, it
seemed to be difficult to construct an acetyl-CoA carboxylase disruptant from
P. rhodozyma by this gene disruption method.

In such a case, other strategies can be applied to decrease (not to disrupt) gene
expression, one of which is conventional mutagenesis to screen for a mutant whose
expression of acetyl-CoA carboxylase is decreased. In this method, an appropriate
recombinant in which an appropriate reporter gene is fused to the promoter region of the
acetyl-CoA carboxylase gene from the host organism is mutated, and mutants which show
a weaker activity of the reporter gene product can be screened. In such mutants, it is
expected that the expression of acetyl-CoA carboxylase activity is decreased by a mutation
lying in the promoter region of the reporter gene or in a trans-acting region which might
affect the expression of the acetyl-CoA carboxylase gene, other than the mutation lying in
the promoter gene itself. In the case of a mutation occurring in the promoter region of the
reporter fusion, such a mutation can be isolated by sequencing the corresponding region.
The isolated mutation can then be introduced into a variety of carotenoid-producing,
especially astaxanthin-producing, mutants derived from P. rhodozyma by recombination
between the original promoter of the acetyl-CoA carboxylase gene on the chromosome
and the mutated promoter fragment. To exclude mutations occurring in a trans-acting
region, a mutation can also be induced by in vitro mutagenesis of a cis element in the
promoter region. In this approach, a gene cassette, containing a reporter gene which is
fused to a promoter region derived from a gene of interest at its 5'-end and a terminator
region from a gene of interest at its 3'-end, is mutagenized and then introduced into
P. rhodozyma. By detecting the difference in the activity of the reporter gene, an effective
mutation can be screened. Such a mutation can be introduced into the sequence of the
native promoter region on the chromosome by the same method as in the case of the in
vivo mutation approach. However, these methods have the drawback of involving
time-consuming processes.

Another strategy to decrease gene expression is an antisense method. This method is
frequently applied to decrease gene expression even when teleomorphic organisms such
as P. rhodozyma are used as host organisms, to which the mutation and gene disruption
methods are usually difficult to apply. The antisense method is a method to decrease
the expression of a gene of interest by introducing an artificial gene fragment whose
sequence is complementary to a cDNA fragment of the gene of interest. Such an antisense
gene fragment would form a complex with a mature mRNA fragment of the objective gene
in vivo and, as a consequence, inhibit efficient translation from the mRNA.

An "antisense" nucleic acid molecule comprises a nucleotide sequence which is
complementary to a "sense" nucleic acid molecule encoding a protein, e.g., complementary
to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA
sequence. Accordingly, an antisense nucleic acid molecule can hydrogen bond to a sense
nucleic acid molecule. The antisense nucleic acid molecule can be complementary to an
entire acetyl-CoA carboxylase-coding strand, or to only a portion thereof. Accordingly, an
antisense nucleic acid molecule can be antisense to a "coding region" of the coding strand
of a nucleotide sequence encoding an acetyl-CoA carboxylase. The term "coding region"
refers to the region of the nucleotide sequence comprising codons which are translated
into amino acid residues. Further, the antisense nucleic acid molecule can be antisense to a
"noncoding region" of the coding strand of a nucleotide sequence encoding acetyl-CoA
carboxylase. The term "noncoding region" refers to 5' and 3' sequences which flank the
coding region and are not translated into a polypeptide (i.e., also referred to as 5' and 3'
untranslated regions).

Given the coding strand sequences encoding acetyl-CoA carboxylase disclosed herein,
antisense nucleic acid molecules of the invention can be designed according to the rules of
Watson and Crick base pairing. The antisense nucleic acid molecule can be complementary
to the entire coding region of acetyl-CoA carboxylase mRNA, but can also be an
oligonucleotide which is antisense to only a portion of the coding or noncoding region of
acetyl-CoA carboxylase mRNA. For example, the antisense oligonucleotide can be
complementary to the region surrounding the translation start site of acetyl-CoA
carboxylase mRNA. An antisense oligonucleotide can be, for example, about 5, 10, 15, 20,
25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid molecule of the
invention can be constructed using chemical synthesis and enzymatic ligation reactions
using procedures known in the art. For example, an antisense nucleic acid molecule (e.g.,
an antisense oligonucleotide) can be chemically synthesized using naturally occurring
nucleotides or variously modified nucleotides designed to increase the biological stability
of the molecules or to increase the physical stability of the duplex formed between the
antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine
substituted nucleotides can be used. Examples of modified nucleotides which can be used
to generate the antisense nucleic acid include 5-fluorouracil, 5-bromouracil,
5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine,
5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine,
N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine,
pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil,
5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v),
5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and
2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced biologically
using an expression vector into which a polynucleotide has been subcloned in an antisense
orientation (i.e., RNA transcribed from the inserted polynucleotide will be of an antisense
orientation to a target polynucleotide of interest, described further in the following
subsection).
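
Designing an antisense oligonucleotide against the region surrounding the translation start site reduces to taking the reverse complement of that window of the coding strand. The sketch below illustrates this under stated assumptions: the function name, the window sizes and the short cDNA fragment are hypothetical, and the result is written as DNA (an RNA version would use U instead of T).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_oligo(
    coding_strand: str, start_codon_index: int, upstream: int = 10, downstream: int = 10
) -> str:
    """Antisense oligonucleotide covering the translation start site.

    The target window spans `upstream` bases before the start codon, the
    start codon itself, and `downstream` bases after it. The returned
    oligonucleotide is written 5'->3' and pairs with the sense strand
    (or the mRNA) according to Watson-Crick rules.
    """
    if coding_strand[start_codon_index:start_codon_index + 3] != "ATG":
        raise ValueError("start_codon_index does not point at an ATG")
    begin = max(0, start_codon_index - upstream)
    end = min(len(coding_strand), start_codon_index + 3 + downstream)
    return reverse_complement(coding_strand[begin:end])


# Hypothetical cDNA fragment with a short 5' untranslated region.
toy_cdna = "GCACGTCAACATGTCTGAAGCT"
start = toy_cdna.find("ATG")
oligo = antisense_oligo(toy_cdna, start, upstream=5, downstream=5)
print(oligo, len(oligo))
```

The window sizes can be adjusted to obtain any of the lengths mentioned above, from about 5 to about 50 nucleotides.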

The antisense nucleic acid molecules of the invention are typically administered to a cell or
generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic
DNA encoding an acetyl-CoA carboxylase to thereby inhibit expression of the protein, e.g.,
by inhibiting transcription and/or translation. The hybridization can be by conventional
nucleotide complementarity to form a stable duplex, or, for example, in the case of an
antisense nucleic acid molecule which binds to DNA duplexes, through specific
interactions in the major groove of the double helix. The antisense molecule can be
modified such that it specifically binds to a receptor or an antigen expressed on a selected
cell surface, e.g., by linking the antisense nucleic acid molecule to a peptide or an antibody
which binds to a cell surface receptor or antigen. The antisense nucleic acid molecule can
also be delivered to cells using the vectors described herein. To achieve sufficient
intracellular concentrations of the antisense molecules, vector constructs in which the
antisense nucleic acid molecule is placed under the control of a strong prokaryotic, viral,
or eukaryotic (including plant) promoter are preferred.

The antisense nucleic acid molecule of the invention may, e.g., be an α-anomeric nucleic
acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded
hybrids with complementary RNA in which, contrary to the usual β-units, the strands run
parallel to each other. The antisense nucleic acid molecule can also comprise a
2'-o-methylribonucleotide or a chimeric RNA-DNA analogue.

Further, the antisense nucleic acid molecule of the invention can be a ribozyme. Ribozymes
are catalytic RNA molecules with ribonuclease activity which are capable of cleaving a
single-stranded nucleic acid, such as an mRNA, to which they have a complementary
region. Thus, ribozymes (e.g., hammerhead ribozymes) can be used to catalytically cleave
acetyl-CoA carboxylase mRNA transcripts to thereby inhibit translation of mRNA. A
ribozyme having specificity for an acetyl-CoA carboxylase-encoding nucleic acid molecule
can be designed based upon the nucleotide sequence of an acetyl-CoA carboxylase cDNA
disclosed herein or on the basis of a heterologous sequence to be isolated according to
methods taught in this invention. For example, a derivative of a Tetrahymena L-19 IVS
RNA can be constructed in which the nucleotide sequence of the active site is
complementary to the nucleotide sequence to be cleaved in an encoding mRNA (see, e.g.,
US 4,987,071 and US 5,116,742). Alternatively, acetyl-CoA carboxylase mRNA can be used
to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA
molecules.

i '• i 

The application of the antisense method to construct a carotenoid-overproducing strain
from P. rhodozyma is disclosed in EP 1,158,051.

In one embodiment the present invention relates to a method of making a recombinant
host cell comprising introducing the vector or the polynucleotide of the present invention
into a host cell.

Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional
transformation or transfection techniques. As used herein, the terms "transformation" and
"transfection", conjugation and transduction are intended to refer to a variety of
art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell,
including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated
transfection, lipofection, natural competence, chemical-mediated transfer, or
electroporation. Suitable methods for transforming or transfecting host cells including
plant cells are known to the skilled artisan.

For stable transfection of mammalian cells, only a small fraction of cells may integrate the
foreign DNA into their genome, depending upon the expression vector and transfection
technique used. In order to identify and select these integrants, a gene that encodes a
selectable marker (e.g., resistance to antibiotics) is generally introduced into the host cells
along with the gene of interest. Preferred selectable markers include those which confer
resistance to drugs, such as G418, hygromycin and methotrexate. Nucleic acid encoding a
selectable marker can be introduced into a host cell on the same vector as that encoding
the polypeptide of the present invention or can be introduced on a separate vector. Cells
stably transfected with the introduced nucleic acid can be identified by, for example, drug
selection (e.g., cells that have incorporated the selectable marker gene will survive, while
the other cells die).

To create a homologous recombinant microorganism, a vector is prepared which contains
at least a portion of the polynucleotide of the present invention into which a deletion,
addition or substitution has been introduced to thereby alter, e.g., functionally disrupt, the
acetyl-CoA carboxylase gene. Preferably, this acetyl-CoA carboxylase gene is a P. rhodozyma
acetyl-CoA carboxylase gene, but it can be a homologue from a related or different
source. Alternatively, the vector can be designed such that, upon homologous recombination,
the endogenous acetyl-CoA carboxylase gene is mutated or otherwise altered but still
encodes a functional protein (e.g., the upstream regulatory region can be altered to thereby
alter the expression of the endogenous acetyl-CoA carboxylase). To create a point mutation
via homologous recombination, DNA-RNA hybrids can also be used, a technique known as
chimeraplasty (Cole-Strauss et al., Nucl. Acids Res., 27, 5, 1323-1330, 1999 and
Kmiec, Gene therapy, American Scientist, 87, 3, 240-247, 1999).

The vector is introduced into a cell, and cells in which the introduced polynucleotide gene
has homologously recombined with the endogenous acetyl-CoA carboxylase gene are
selected using art-known techniques.

Further, host cells can be produced which contain selection systems which allow for regulated
expression of the introduced gene. For example, inclusion of the polynucleotide of the
invention on a vector placing it under control of the lac operon permits expression of the
polynucleotide only in the presence of IPTG. Such regulatory systems are well known in
the art.

Preferably, the introduced nucleic acid molecule is foreign to the host cell.

By "foreign" it is meant that the nucleic acid molecule is either heterologous with respect
to the host cell, this means derived from a cell or organism with a different genomic
background, or is homologous with respect to the host cell but located in a different genomic
environment than the naturally occurring counterpart of said nucleic acid molecule. This
means that, if the nucleic acid molecule is homologous with respect to the host cell, it is
not located in its natural location in the genome of said host cell; in particular it is
surrounded by different genes. In this case the nucleic acid molecule may be either under the
control of its own promoter or under the control of a heterologous promoter. The vector
or nucleic acid molecule according to the invention which is present in the host cell may
either be integrated into the genome of the host cell or it may be maintained in some form
extrachromosomally. In this respect, it is also to be understood that the nucleic acid
molecule of the invention can be used to restore or create a mutant gene via homologous
recombination.

Accordingly, in another embodiment the present invention relates to a host cell genetically
engineered with the polynucleotide of the invention or the vector of the invention.

The terms "host cell" and "recombinant host cell" are used interchangeably herein. It is
understood that such terms refer not only to the particular subject cell but to the progeny
or potential progeny of such a cell. Because certain modifications may occur in succeeding
generations due to either mutation or environmental influences, such progeny may not, in
fact, be identical to the parent cell, but are still included within the scope of the term as
used herein.

For example, a polynucleotide of the present invention can be introduced in bacterial cells
as well as insect cells, fungal cells or mammalian cells (such as Chinese hamster ovary cells
(CHO) or COS cells), algae, ciliates, plant cells, fungi or other microorganisms like E. coli.
Other suitable host cells are known to those skilled in the art. Preferred are E. coli,
baculovirus, Agrobacterium or fungal cells, for example, those of the genus Saccharomyces,
e.g. those of the species S. cerevisiae, or P. rhodozyma (Xanthophyllomyces dendrorhous).

In addition, in one embodiment, the present invention relates to a method for the production
of fungal transformants comprising the introduction of the polynucleotide or the vector
of the present invention into the genome of said fungal cell.

For the expression of the nucleic acid molecules according to the invention in sense or
antisense orientation in plant cells, the molecules are placed under the control of regulatory
elements which ensure the expression in fungal cells. These regulatory elements may
be heterologous or homologous with respect to the nucleic acid molecule to be expressed
as well as with respect to the fungal species to be transformed.

In general, such regulatory elements comprise a promoter active in fungal cells. To obtain
constitutive expression in fungal cells, preferably constitutive promoters are used, e.g., the
glyceraldehyde-3-dehydrogenase promoter derived from P. rhodozyma (WO 97/23,633).
Inducible promoters may be used in order to be able to exactly control expression. An
example for inducible promoters is the promoter of genes encoding heat shock proteins.
Also an amylase gene promoter which is a candidate for such inducible promoters has
been described (EP 1,035,206). The regulatory elements may further comprise
transcriptional and/or translational enhancers functional in fungal cells. Furthermore, the
regulatory elements may include transcription termination signals, such as a poly-A signal,
which lead to the addition of a poly-A tail to the transcript which may improve its stability.

Methods for the introduction of foreign DNA into fungal cells are also well known in the
art. These include, for example, transformation with the LiCl method, the fusion of
protoplasts, electroporation, biolistic methods like particle bombardment, and other methods
known in the art. Methods for the transformation using biolistic methods are well known to
the person skilled in the art.

The term "transformation" as used herein, refers to the transfer of an exogenous
polynucleotide into a host cell, irrespective of the method used for the transfer. The
polynucleotide may be transiently or stably introduced into the host cell and may be
maintained non-integrated, for example, as a plasmid or as chimeric links, or alternatively,
may be integrated into the host genome.

In general, the fungi which can be modified according to the invention and which either
show overexpression of a protein according to the invention or a reduction of the synthesis
of such a protein can be derived from any desired fungal species.

Further, in one embodiment, the present invention relates to a fungal cell comprising the
polynucleotide or the vector, or obtainable by the method of the present invention.

Thus, the present invention relates also to transgenic fungal cells which contain (preferably
stably integrated into the genome) a polynucleotide according to the invention linked to
regulatory elements which allow expression of the polynucleotide in fungal cells and
wherein the polynucleotide is foreign to the transformed fungal cell. For the meaning of
foreign, see supra.

Thus, the present invention also relates to transformed fungal cells according to the
invention.

Accordingly, due to the altered expression of acetyl-CoA carboxylase, cellular metabolic
pathways are modulated in yield, production, and/or efficiency of production.

The terms "production" or "productivity" are art-recognized and include the concentration
of the fermentation product (for example fatty acids, carotenoids, (poly)saccharides,
lipids, vitamins, isoprenoids, wax esters, and/or polymers like polyhydroxyalkanoates
and/or its metabolism products or further desired fine chemical as mentioned herein)
formed within a given time and a given fermentation volume (e.g., kg product per hour
per liter).

The term "efficiency" of production includes the time required for a particular level of
production to be achieved (for example, how long it takes for the cell to attain a particular
rate of output of a said altered yield, in particular, into carotenoids, (poly)saccharides,
lipids, vitamins, isoprenoids etc.).

The term "yield" or "product/carbon yield" is art-recognized and includes the efficiency of
the conversion of the carbon source into the product (i.e. acetyl CoA, fatty acids, vitamins,
carotenoids, isoprenoids, lipids etc. and/or further compounds as defined above and
which biosynthesis is based on said products). This is generally written as, for example, kg
product per kg carbon source. By increasing the yield or production of the compound, the
quantity of recovered molecules, or of useful recovered molecules of that compound in a
given amount of culture over a given amount of time is increased.
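
The two units named in these definitions (kg product per hour per liter, and kg product
per kg carbon source) can be illustrated with a minimal sketch. The function names and the
run figures below are hypothetical and are not taken from this specification.

```python
def volumetric_productivity(product_kg, hours, volume_l):
    """Product formed per unit time and fermentation volume (kg per hour per liter)."""
    if hours <= 0 or volume_l <= 0:
        raise ValueError("hours and volume_l must be positive")
    return product_kg / (hours * volume_l)


def carbon_yield(product_kg, carbon_source_kg):
    """Product formed per unit of carbon source (kg product per kg carbon source)."""
    if carbon_source_kg <= 0:
        raise ValueError("carbon_source_kg must be positive")
    return product_kg / carbon_source_kg


# Hypothetical run: 0.5 kg product from 20 kg carbon source in 100 h at 1000 liters.
productivity = volumetric_productivity(0.5, 100, 1000)
yield_ = carbon_yield(0.5, 20)
```

Keeping the two quantities as separate functions mirrors the text, which treats productivity
and yield as distinct measures that can be modulated independently.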


The terms "biosynthesis" (which is used synonymously for "synthesis" or "biological
production" in cells, tissues, plants, etc.) or a "biosynthetic pathway" are art-recognized and
include the synthesis of a compound, preferably an organic compound, by a cell from
intermediate compounds in what may be a multistep and highly regulated process.

The language "metabolism" is art-recognized and includes the totality of the biochemical
reactions that take place in an organism. The metabolism of a particular compound, then,
(e.g., the metabolism of acetyl CoA, a fatty acid, hexose, isoprenoid, vitamin, carotenoid,
lipid etc.) comprises the overall biosynthetic, modification, and degradation pathways in
the cell related to this compound.

Such a genetically engineered P. rhodozyma would be cultivated in an appropriate medium
and evaluated in its productivity of carotenoids, especially astaxanthin. A hyper producer
of astaxanthin thus selected would be confirmed in view of the relationship between its
productivity and the level of gene or protein expression which is introduced by such a
genetic engineering method.

The present invention is further illustrated with Examples described below.

The following materials and methods employed in the Examples are described below:

Strains

P. rhodozyma ATCC96594 (re-deposited under the accession No. ATCC 74438 on April 8,
1998 pursuant to the Budapest Treaty)

E. coli DH5α: F-, φ80d, lacZΔM15, Δ(lacZYA-argF)U169, hsd (rK-, mK+), recA1, endA1,
deoR, thi-1, supE44, gyrA96, relA1 (Toyobo, Osaka, Japan)

E. coli XL1-Blue MRF': Δ(mcrA)183, Δ(mcrCB-hsdSMR-mrr)173, endA1, supE44, thi-1,
recA1, gyrA96, relA1, lac [F' proAB, lacIqZΔM15, Tn10 (tetr)] (Stratagene, La Jolla, USA)

E. coli SOLR: e14-(mcrA), Δ(mcrCB-hsdSMR-mrr)171, sbcC, recB, recJ, umuC::Tn5(kanr),
uvrC, lac, gyrA96, relA1, endA1, λR, [F' proAB, lacIqZΔM15] Su- (nonsuppressing)
(Stratagene)

E. coli TOP10: mcrA, Δ(mrr-hsdRMS-mcrBC), ΔlacZM15, ΔlacX74, recA1, deoR,
araD139, (ara-leu)7697, galU, galK, rpsL (Strr), endA1, nupG (Invitrogen, Carlsbad, USA)

Vectors

λZAPII (Stratagene)

pBluescriptII KS- (Stratagene)

pMOSBlue T-vector (Amersham, Buckinghamshire, U.K.)

pCR2.1-TOPO (Invitrogen)

Media

P. rhodozyma strain was maintained routinely in YPD medium (DIFCO, Detroit, U.S.A.).
E. coli strain was maintained in LB medium (10 g Bacto-tryptone, 5 g yeast extract (DIFCO)
and 5 g NaCl per liter). NZY medium (5 g NaCl, 2 g MgSO4-7H2O, 5 g yeast extract
(DIFCO), 10 g NZ amine type A (WAKO, Osaka, Japan) per liter) is used for λ phage
propagation in a soft agar (0.7 % agar (WAKO)). When an agar medium was prepared, 1.5 %
of agar (WAKO) was supplemented.
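
As a minimal sketch of how the per-liter NZY recipe above scales to a working volume, the
following helper multiplies each quantity by the batch volume. It assumes that the agar
percentages are weight per volume (g per 100 ml), which the text does not state explicitly;
the function and variable names are illustrative only.

```python
# NZY medium, grams per liter, as listed in the Media section.
NZY_G_PER_L = {
    "NaCl": 5.0,
    "MgSO4-7H2O": 2.0,
    "yeast extract": 5.0,
    "NZ amine type A": 10.0,
}

# Agar content in percent; assumed here to mean % (w/v).
AGAR_PERCENT_WV = {"soft": 0.7, "plate": 1.5}


def weigh_out(recipe_g_per_l, volume_ml, agar=None):
    """Return grams of each component needed for volume_ml of medium."""
    liters = volume_ml / 1000.0
    amounts = {name: grams * liters for name, grams in recipe_g_per_l.items()}
    if agar is not None:
        # x % (w/v) is x g per 100 ml, i.e. 10 * x g per liter.
        amounts["agar"] = AGAR_PERCENT_WV[agar] * 10.0 * liters
    return amounts


soft_agar_250ml = weigh_out(NZY_G_PER_L, 250, agar="soft")
```

For 250 ml of NZY soft agar this gives 1.25 g NaCl, 0.5 g MgSO4-7H2O, 1.25 g yeast extract,
2.5 g NZ amine and 1.75 g agar under the stated assumption.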

Methods

Restriction enzymes and T4 DNA ligase were purchased from Takara Shuzo (Ohtsu,
Japan).

Isolation of a chromosomal DNA from P. rhodozyma was performed by using QIAGEN
Genomic Kit (QIAGEN, Hilden, Germany) following the protocol supplied by the
manufacturer. Mini-prep of plasmid DNA from transformed E. coli was performed with the
Automatic DNA isolation system (PI-50, Kurabo, Co. Ltd., Osaka, Japan). Midi-prep of
plasmid DNA from an E. coli transformant was performed by using QIAGEN column
(QIAGEN). Isolation of λ DNA was performed by Wizard lambda preps DNA
purification system (Promega, Madison, U.S.A.) following the protocol prepared by the
manufacturer. A DNA fragment was isolated and purified from agarose by using
QIAquick or QIAEX II (QIAGEN). Manipulation of λ phage derivatives followed
the protocol prepared by the manufacturer (Stratagene).

Isolation of total RNA from P. rhodozyma was performed with the phenol method by using
Isogen (Nippon Gene, Toyama, Japan). mRNA was purified from total RNA thus obtained
by using mRNA separation kit (Clontech). cDNA was synthesized by using
CapFinder cDNA construction kit (Clontech).

In vitro packaging was performed by using Gigapack III gold packaging extract
(Stratagene).

The polymerase chain reaction (PCR) was performed with the thermal cycler from Perkin
Elmer model 2400. Each PCR condition is described in examples. PCR primers were
purchased from a commercial supplier. Fluorescent DNA primers for DNA sequencing were
purchased from Pharmacia. DNA sequencing was performed with the automated
fluorescent DNA sequencer (ALFred, Pharmacia).

Competent cells of DH5α were purchased from Toyobo (Japan).

Example 1: Isolation of mRNA from P. rhodozyma and construction of cDNA library 

To construct a cDNA library of P. rhodozyma, total RNA was isolated by phenol extraction
method right after the cell disruption and the mRNA from P. rhodozyma ATCC96594
strain was purified by using mRNA separation kit (Clontech).

At first, cells of ATCC96594 strain from 10 ml of two-day-culture in YPD medium were
harvested by centrifugation (1500 x g for 10 min.) and washed once with extraction buffer
(10 mM Na-citrate / HCl (pH 6.2) containing 0.7 M KCl). After suspending in 2.5 ml of
extraction buffer, the cells were disrupted by French press homogenizer (Ohtake Works
Corp., Tokyo, Japan) at 1500 kgf/cm2 and immediately mixed with two times of volume of
Isogen (Nippon Gene) according to the method specified by the manufacturer. In this step,
400 µg of total RNA was recovered.

Then, this total RNA was purified by using mRNA separation kit (Clontech) according to
the method specified by the manufacturer. Finally, 16 µg of mRNA from P. rhodozyma
ATCC96594 strain was obtained.

To construct a cDNA library, CapFinder PCR cDNA construction kit (Clontech) was used
according to the method specified by the manufacturer. One µg of purified mRNA was
applied for a first strand synthesis followed by PCR amplification. After this amplification
by PCR, 1 mg of cDNA pool was obtained.

Example 2: Cloning of a partial ACC (acetyl-CoA carboxylase) gene from P. rhodozyma

To clone a partial ACC gene from P. rhodozyma, a degenerate PCR method was exploited.
The species and database accession numbers whose acetyl-CoA carboxylase sequences
were used for multiple alignment analysis are as follows.
Arabidopsis thaliana D34630 (DDBJ)
Emericella nidulans Y15996 (EMBL)
Gallus gallus P11029 (Swiss-Prot)
Glycine max L48995 (GenBank)
Homo sapiens S41121 (PIR)
Medicago sativa L25042 (GenBank)
Ovis aries Q28559 (Swiss-Prot)
Rattus norvegicus P11497 (Swiss-Prot)
Saccharomyces cerevisiae Q00955 (Swiss-Prot)
Schizosaccharomyces pombe P78820 (Swiss-Prot)
Ustilago maydis S49991 (PIR)

Two mixed primers were designed and synthesized based on the common sequence of
known acetyl-CoA carboxylase genes from other species: acc9 (sense primer) (SEQ ID NO:4)
and acc13 (antisense primer) (SEQ ID NO:5) (in the sequences "n" means nucleotides
a, c, g or t, "h" means nucleotides a, c or t, "m" means nucleotides a or c, "k" means
nucleotides g or t, and "y" means nucleotides c or t).
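
The degenerate base codes defined above determine how many distinct oligonucleotides a
mixed primer represents. The sketch below expands a degenerate primer into its individual
sequences. The example primer is hypothetical and is not SEQ ID NO:4 or SEQ ID NO:5, which
are not reproduced in this section.

```python
from itertools import product

# Degenerate codes exactly as defined in the text above.
DEGENERATE = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "n": "acgt", "h": "act", "m": "ac", "k": "gt", "y": "ct",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a mixed primer."""
    count = 1
    for base in primer.lower():
        count *= len(DEGENERATE[base])
    return count


def expand(primer):
    """List every concrete sequence encoded by a mixed primer."""
    pools = [DEGENERATE[base] for base in primer.lower()]
    return ["".join(combo) for combo in product(*pools)]


example = "gcnathaay"  # hypothetical primer, for illustration only
n_variants = degeneracy(example)
variants = expand(example)
```

Here "n", "h" and "y" contribute 4, 3 and 2 alternatives, so the hypothetical primer stands
for 24 sequences. A low annealing temperature, such as the 45°C used below, is a common way
to tolerate this kind of mixture.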
After the PCR reaction of 23 cycles of 95°C for 30 seconds, 45°C for 30 seconds and 72°C
for 15 seconds by using ExTaq (Takara Shuzo) as a DNA polymerase and the cDNA pool
obtained in Example 1 as a template, the reaction mixture was applied to agarose gel
electrophoresis. One PCR band that had a desired length (0.8 kb) was recovered from the
agarose gel and purified by QIAquick (QIAGEN) according to the method by the manufacturer
and then ligated to pMOSBlue T-vector (Amersham). After transformation of competent
E. coli DH5α, 6 white colonies were selected and plasmids were isolated with Automatic
DNA isolation system. As a result of sequencing, it was found that 3 clones had a sequence
whose deduced amino acid sequence was similar to known acetyl-CoA carboxylase genes.
These isolated cDNA clones were designated as pACC1014 and used for further screening
study.

Example 3: Isolation of genomic DNA from P. rhodozyma

To isolate a genomic DNA from P. rhodozyma, QIAGEN genomic kit was used according
to the method specified by the manufacturer.

At first, cells of P. rhodozyma ATCC96594 strain from 100 ml of overnight culture in YPD
medium were harvested by centrifugation (1500 x g for 10 min.) and washed once with TE
buffer (10 mM Tris / HCl (pH 8.0) containing 1 mM EDTA). After suspending in 8 ml of
Y1 buffer of the QIAGEN genomic kit, lyticase (SIGMA, St. Louis, U.S.A.) was added at
the concentration of 2 mg/ml to disrupt cells by enzymatic degradation and the reaction
mixture was incubated for 90 min at 30°C and then proceeded to the next extraction step.
Finally, 20 µg of genomic DNA was obtained.

Example 4: Southern blot hybridization by using pACC1014 as a probe 

Southern blot hybridization was performed to clone a genomic fragment which contains
the ACC gene from P. rhodozyma. Two µg of genomic DNA was digested by EcoRI and
subjected to agarose gel electrophoresis followed by acidic and alkaline treatment. The
denatured DNA was transferred to nylon membrane (Hybond N+, Amersham) by using
transblot (Joto Rika, Tokyo, Japan) for an hour. The DNA which was transferred to nylon
membrane was fixed by a heat treatment (80°C, 90 min). A probe was prepared by
labeling a template DNA (EcoRI- and SalI-digested pACC1014) with DIG multipriming method
(Boehringer Mannheim). Hybridization was performed with the method specified by the
manufacturer. As a result, a hybridized band was visualized in the range from 2.0 to 2.3
kilobases (kb).
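
The band size seen on such a blot corresponds to the distance between neighbouring EcoRI
sites around the probed region. As a minimal sketch, a complete EcoRI digest of a known
linear sequence can be simulated as follows. The toy sequence is invented for illustration
and is not derived from SEQ ID NO:1.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence; the enzyme cuts after the G


def ecori_fragment_lengths(sequence):
    """Fragment lengths (bp) from a complete EcoRI digest of a linear sequence."""
    seq = sequence.upper()
    cut_positions = []
    start = seq.find(ECORI_SITE)
    while start != -1:
        cut_positions.append(start + 1)  # cut falls between G and AATTC
        start = seq.find(ECORI_SITE, start + 1)
    boundaries = [0] + cut_positions + [len(seq)]
    return [right - left for left, right in zip(boundaries, boundaries[1:])]


def bands_in_range(lengths, low_bp, high_bp):
    """Keep only fragments whose length lies within [low_bp, high_bp]."""
    return [n for n in lengths if low_bp <= n <= high_bp]


# Toy sequence with two EcoRI sites placed 2100 bp apart.
toy = "A" * 500 + ECORI_SITE + "C" * 2094 + ECORI_SITE + "T" * 300
fragments = ecori_fragment_lengths(toy)
candidate_bands = bands_in_range(fragments, 2000, 2300)
```

In the toy case only the internal 2100 bp fragment falls in the 2.0 to 2.3 kb window
reported above.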

Example 5: Cloning of a genomic fragment containing the ACC gene 

4 µg of the genomic DNA was digested by EcoRI and subjected to agarose gel
electrophoresis. Then, DNAs with a length within the range from 1.5 to 2.5 kb were recovered by
QIAEX II gel extraction kit (QIAGEN) according to the method specified by the
manufacturer. The purified DNA was ligated to 0.5 µg of EcoRI-digested and CIAP (calf intestine
alkaline phosphatase)-treated λZAP II (Stratagene) at 16°C overnight, and packaged by
Gigapack III gold packaging extract (Stratagene). The packaged extract was infected to E.
coli MRF' strain and over-laid with NZY medium poured onto LB agar medium. About
5000 plaques were screened by using EcoRI- and SalI-digested pACC1014 as a probe. Five
plaques were hybridized to the labeled probe.

The in vivo excision protocol was applied to these λZAP II derivatives containing putative
ACC gene from P. rhodozyma by following the instruction manual (Stratagene) to clone
the insert fragment into E. coli cloning vector, pBluescript SK. Each clone recovered from
five positive plaques was subjected to sequencing analysis and it was found that three
of them had the identical sequence to the insert fragment of pACC1014. One of the clones
was named as pACC1224 and used for further study. As a result of sequencing of
the entire region of insert fragment in pACC1224, it was suggested that this clone
contained neither the 5'- nor the 3'-end of the ACC gene.

Example 6: Cloning of the flanking region of the insert fragment in pACC1224 from
the genome of P. rhodozyma by genome walking method

Two PCR primers were synthesized based on the internal sequence of pACC1224 and used
for the genome walking method: acc17 (SEQ ID NO:6) and acc18 (SEQ ID NO:7).
The protocol of the instruction manual provided by the supplier (Clontech) was
followed for the genome walking method. In the PCR reaction using acc17 primer, a 2.8
kb PCR band emerged from the genomic StuI library. In the case of acc18 primer, a 2.2 kb
PCR band was produced in the genomic PvuII library. These PCR bands were cloned into
pCR2.1-TOPO (Invitrogen) and it was revealed that the 2.8 kb PCR band contained a 5'
fragment of ACC gene and the 2.2 kb PCR band contained a 3' fragment of ACC gene, respectively.
The clones containing the 2.8 kb and 2.2 kb PCR fragments were named as pACCStu107 and
pACCPvu107, respectively, and used for further study.

Example 7: Southern blot hybridization by using pACCStu107 and pACCPvu107 as
probes

Southern blot hybridization was performed to clone a genomic fragment which covered
the ACC gene from P. rhodozyma. 2 µg of genomic DNA was digested by EcoRI and
subjected to agarose gel electrophoresis followed by acidic and alkaline treatment. The
denatured DNA was transferred to nylon membrane (Hybond N+, Amersham) by using
transblot (Joto Rika, Tokyo, Japan) for an hour. The DNA which was transferred to nylon
membrane was fixed by a heat treatment (80°C, 90 min). A probe was prepared by
labeling a template DNA (EcoRI-digested pACCStu107 and pACCPvu107) with the DIG
multipriming method (Boehringer Mannheim). Hybridization was performed with the method
specified by the manufacturer. As a result, several hybridized bands whose sizes were close
to 2.0 kb, 0.9 kb and 0.6 kb were visualized when the insert fragment in pACCStu107 was
used as a probe. In the case that the insert fragment in pACCPvu107 was used as a probe,
a hybridized band was visualized in the range from 6.0 kb to 6.5 kb.

Example 8: Cloning of the genomic clone covering the ACC gene

In a similar manner to Example 5, the genomic fragment containing the insert fragment in
pACCStu107 and pACCPvu107 was cloned by plaque hybridization. 4 µg of the genomic
DNA was digested by EcoRI and subjected to agarose gel electrophoresis. Then, DNAs
with a length within the following ranges were recovered by QIAEX II gel extraction kit
(QIAGEN) according to the method specified by the manufacturer: (1) from 2.7 to 6.0 kb;
(2) from 1.4 to 2.7 kb; and (3) from 0.5 to 1.4 kb.
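
The three size fractions can be matched against the Southern bands reported in Example 7
to see which library each band should be recovered from. This is a minimal sketch: the
upper bound of the first fraction is taken as 6.0 kb, which is how that library is referred
to in the candidate list and in Example 11, and the band reported at 6.0 to 6.5 kb is
represented by its lower edge, so it sits on the boundary of the fraction.

```python
# Size fractions recovered from the gel, in kb: number -> (lower, upper).
LIBRARIES_KB = {1: (2.7, 6.0), 2: (1.4, 2.7), 3: (0.5, 1.4)}


def library_for(band_kb):
    """Number of the first fraction whose range contains the band, else None."""
    for number, (low, high) in LIBRARIES_KB.items():
        if low <= band_kb <= high:
            return number
    return None


# Approximate band sizes from Example 7 (kb).
assignments = {band: library_for(band) for band in (2.0, 0.9, 0.6, 6.0)}
```

Under these assumptions the 2.0 kb band maps to fraction (2), the 0.9 kb and 0.6 kb bands
to fraction (3), and the largest band to fraction (1), which agrees with the probes and
libraries listed for the isolated plaques.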

Each purified DNA was ligated to 0.5 µg of EcoRI-digested and CIAP (calf intestine
alkaline phosphatase)-treated λZAP II (Stratagene) at 16°C overnight, and packaged by
Gigapack III gold packaging extract (Stratagene). The packaged extract was infected to E. coli
MRF' strain and over-laid with NZY medium poured onto LB agar medium. About 5000
plaques were screened by using EcoRI-digested pACCStu107 and pACCPvu107 as probes.
The following candidates were isolated after plaque hybridization study.

1) 3 plaques from the 2.7 to 6.0 kb library by using the insert of pACCPvu107 as a probe.

2) 3 plaques from the 1.4 to 2.7 kb library by using the insert of pACCStu107 as a probe.

3) 21 plaques from the 0.5 to 1.4 kb library by using the insert of pACCStu107 as a probe.

The in vivo excision protocol was applied to these λZAP II derivatives containing putative
ACC gene from P. rhodozyma by following the instruction manual (Stratagene) to clone
the insert fragment into E. coli cloning vector, pBluescript SK. Each clone recovered from
the positive plaques was subjected to sequencing analysis. According to BLAST X analysis
(http://www.blast.genome.ad.jp/), each clone contained at least a part of the putative
ACC gene. The following clones were selected and used for further analysis:

pACC119-18 having a 6 kb insert and covering the 3' end of the ACC gene;

pACC119-17-0.6 having a 0.6 kb insert flanking the 5' end of the pACC1224 insert
fragment;

pACC119-17-2 having a 2 kb insert flanking the 5' end of the pACC119-17-0.6 insert
fragment; and

pACC127-17-0.9 having a 0.9 kb insert flanking the 5' end of the pACC119-17-2 insert
fragment.

As a result of sequencing of the entire region of the insert fragments in pACC119-18,
pACC119-17-0.6, pACC119-17-2 and pACC127-17-0.9, it was suggested that these clones
did not cover the 5' end of the ACC gene.

Example 9: Cloning of the flanking region of the insert fragment in pACC127-17-0.9
from the genome of P. rhodozyma by genome walking method

PCR primer acc26 (SEQ ID NO:8) was synthesized based on the internal sequence of
pACC127-17-0.9 and used for genome walking method.
In the PCR reaction using acc26 primer, a 2.6 kb PCR band emerged from the genomic
PvuII library. This PCR band was cloned into pCR2.1-TOPO (Invitrogen) and it was
revealed that this clone contained a 5' fragment of ACC gene as a result of BLAST X analysis.
This clone was named as pACCPvu126 and used for further study.

Example 10: Southern blot hybridization by using pACCPvu126 as a probe

Southern blot hybridization was performed to clone a genomic fragment which covered the 5'
end of the ACC gene from P. rhodozyma. In a similar manner as Example 7, Southern blot
hybridization was performed. A probe was prepared by labeling a template DNA
(EcoRI-digested pACCPvu126) with DIG multipriming method (Boehringer Mannheim).
Hybridization was performed with the method specified by the manufacturer. As a result, a
hybridized band whose size was close to 5.0 kb was visualized.

Example 11: Cloning of the genomic clone covering 5' end of ACC gene

In a similar manner to Example 8, the genomic fragment containing the insert fragment in
pACCPvu126 was cloned by plaque hybridization. The genomic library covering 2.7 to 6.0
kb in length prepared in Example 8 was also used. Twelve positive plaques which
hybridized to the insert fragment of pACCPvu126 labeled with DIG were isolated and subjected
to in vivo excision to obtain plasmid DNA. As a result of sequencing for thus isolated
plasmids, most of the plasmids had the identical sequence to the insert fragment of
pACCPvu126. One of the clones was named as pACC204 and used for further study.

Example 12: Cloning of the gapped region between pACC204 and pACC127-17-0.9 

As a result of BLAST X analysis against known acetyl-CoA carboxylase genes following
the sequencing of the 3' end of the insert fragment in pACC204 and the 5' end of the insert
fragment in pACC127-17-0.9, it was suggested that an approximately 0.3 kb fragment
could still be missing for coverage of the entire ACC gene. The following PCR primers
were synthesized based on the internal sequences of pACC204 and pACC127-17-0.9: acc43
(sense primer) (SEQ ID NO:9) and acc44 (antisense primer) (SEQ ID NO:10).

After the PCR reaction of 25 cycles of 94°C for 15 seconds, 55°C for 30 seconds and 72°C
for 15 seconds by using HF polymerase (Clontech) as a DNA polymerase and a genomic
DNA obtained in Example 3 as a template, the reaction mixture was applied to agarose gel
electrophoresis. One PCR band that had a desired length (0.3 kb) was recovered from the
agarose gel and purified by QIAquick (QIAGEN) according to the method by the
manufacturer and then cloned into pCR2.1-TOPO (Invitrogen). After transformation of
competent E. coli TOP10, 6 white colonies were selected and plasmids were isolated with
Automatic DNA isolation system. As a result of sequencing, it was found that 5 clones had an
identical sequence to each other. One of the isolated clones was designated as
pACC210.

Example 13: Sequencing of a complete genomic fragment containing ACC gene 

pACC204, pACC210, pACC127-17-0.9, pACC119-17-2, pACC119-17-0.6, pACC1224 and
pACC119-18 were sequenced with primer walking procedure by using AutoRead
sequencing kit (Pharmacia).

As a result of sequencing, the nucleotide sequence comprising 10561 base pairs of the
genomic fragment containing the ACC gene from P. rhodozyma containing its promoter
(1445 base pairs) and terminator (1030 base pairs) was determined (SEQ ID NO:1).

The coding region was 8086 base pairs long and consisted of 19 exons and 18 introns.
Introns were dispersed throughout the coding region without 5' or 3' bias. It was found that
the open reading frame (SEQ ID NO:2) encodes 2187 amino acids (SEQ ID NO:3) whose
sequence is strikingly similar to the known amino acid sequences of acetyl-CoA carboxylase
from other species (56.28% identity to acetyl-CoA carboxylase from Emericella nidulans),
as a result of a homology search by GENETYX-SV/RC software (Software Development Co.,
Ltd., Tokyo, Japan).
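
A percent identity figure such as the one reported above can be illustrated on a pair of already aligned sequences. The sketch below is an assumption about how identity might be counted (identical residues divided by aligned columns, with gap columns counted as mismatches); the text does not state how GENETYX-SV/RC defines identity, and the function name and the short example strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences.

    Both inputs must already be aligned (equal length, '-' for gaps).
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" and y == "-":
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, 8 columns with 6 identical residues.
identity = percent_identity("MVVDHE-S", "MVADHEAS")
print(identity)
```

A value computed this way could then be compared against a threshold such as the 56.3 % used in the claims.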

Fig. 1 depicts a cloned DNA fragment covering ACC gene region on the chromosome of P. 
rhodozyma. 

Example 14: Construction of antisense plasmid for ACC gene 

An antisense gene fragment which covers the entire structural gene for ACC is amplified
by PCR and then cloned into an integration vector in which the antisense ACC gene is
transcribed by its own ACC promoter in P. rhodozyma.
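
An antisense fragment corresponds to the reverse complement of the sense strand. The following is a minimal sketch of that operation, assuming plain A/C/G/T input; the function name is introduced here for illustration, and the example uses the first four codons of the coding sequence as they appear in the sequence listing.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    if set(seq.upper()) - set("ACGT"):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return seq.translate(_COMPLEMENT)[::-1]


sense = "ATGGTTGTCGAT"  # atg gtt gtc gat
antisense = reverse_complement(sense)
print(antisense)
```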

The primers include an asymmetrical recognition sequence for the restriction enzyme SfiI
(GGCCNNNNNGGCC), but their asymmetrical overhang sequences are designed to be
different. This enables directional cloning into an expression vector which has the same
asymmetrical sequences at its ligation sites. The use of such a construction is
disclosed in EP 1,158,051.
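
The recognition pattern GGCCNNNNNGGCC can be located in a primer or vector sequence with a regular expression. The sketch below is illustrative only: the function name and the two primer tails are hypothetical and are not taken from SEQ ID NOs of this document. It reports the position of each site and its five-nucleotide spacer, so that two sites can be checked for different spacers, which is the property that directional cloning relies on.

```python
import re

# GGCC, five arbitrary bases, GGCC; the lookahead allows overlapping sites.
SFII_SITE = re.compile(r"(?=GGCC([ACGT]{5})GGCC)")


def find_sfii_sites(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, 5-nt spacer) for every SfiI recognition site."""
    return [(m.start(), m.group(1)) for m in SFII_SITE.finditer(seq.upper())]


# Hypothetical primer tails with different spacers.
left_tail = "aaGGCCATTATGGCCtt"
right_tail = "aaGGCCGAGGCGGCCtt"

left_sites = find_sfii_sites(left_tail)
right_sites = find_sfii_sites(right_tail)
spacers_differ = left_sites[0][1] != right_sites[0][1]
print(left_sites, right_sites, spacers_differ)
```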

For the promoter and terminator fragments which can drive the transcription of the
antisense ACC gene, the ACC promoter and terminator are cloned from the chromosome by
using the sequence information listed in SEQ ID NO:1. The ACC terminator fragment is
fused to a G418 resistance cassette by ligating the DNA fragment containing the ACC
terminator and the G418 resistance cassette of pG418Sa330 (EP 1,035,206) into an appropriate
vector such as pBluescriptII KS- (Stratagene).
Then, a 3.1 kb SacI fragment containing the ribosomal DNA (rDNA) locus (Wery et al.,
Gene, 184, 89-97, 1997) is inserted downstream of the G418 cassette on the plasmid thus
prepared. The rDNA fragment exists in multiple copies on the chromosome of eukaryotes. An
integration event via the rDNA fragment would result in multicopy integration onto the
chromosome of the host used, and this enables the overexpression of foreign genes which
are harbored in the expression vector.

Subsequently, the ACC promoter is inserted upstream of the ACC terminator to construct
an expression vector which functions in P. rhodozyma.

Finally, the antisense ACC construct is completed by inserting the 1.5 kb SfiI fragment
containing antisense ACC into the thus prepared expression vector functioning in
P. rhodozyma. A similar plasmid construction is disclosed in EP 1,158,051.

Example 15: Transformation of P. rhodozyma with an ACC-antisense vector

The ACC-antisense vector thus prepared is transformed into P. rhodozyma wild type strain,
ATCC96594. The protocol for the biolistic transformation is disclosed in EP 1,158,051.

Example 16: Characterisation of antisense ACC recombinant of P. rhodozyma 

The antisense ACC recombinant of P. rhodozyma ATCC96594 is cultured in 50 ml of YPD
medium in a 500 ml Erlenmeyer flask at 20°C for 3 days by using a seed culture which
has been grown in 10 ml of YPD medium in test tubes (21 mm in diameter) at 20°C for 3 days.
An appropriate volume of culture broth is withdrawn and used for analysis of growth and
productivity of carotenoids, especially astaxanthin. For analysis of growth, optical
density at 660 nm is measured by using a UV-1200 photometer (Shimadzu Corp., Kyoto,
Japan), in addition to the determination of the dried cell mass by collecting the cells
from 1 ml of broth by microcentrifugation and drying them at 100°C for one day. For the
analysis of the content of astaxanthin and total carotenoids, cells are harvested from
1.0 ml of broth by microcentrifugation and used for the extraction of the carotenoids
from cells of P. rhodozyma by disruption with glass beads. After extraction,
disrupted cells are removed by centrifugation and the resulting extract is analyzed for
carotenoid content with HPLC. The HPLC conditions used are as follows: HPLC column:
Chrompack Lichrosorb si-60 (4.6 mm, 250 mm); Temperature: room temperature; Eluent:
acetone / hexane (18/82), with 1 ml/L of water added to the eluent; Injection volume:
10 µl; Flow rate: 2.0 ml/min; Detection: UV at 450 nm. A reference sample of astaxanthin
can be obtained from Hoffmann La-Roche (Basel, Switzerland).
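
The text does not describe how HPLC peak areas are converted into an astaxanthin content. One common possibility, shown here purely as an assumption, is a single-point external standard calibration against the reference sample, normalised to the dried cell mass from the same broth volume. The function name, the parameter names and the numbers in the example are hypothetical and are not measured values.

```python
def astaxanthin_mg_per_g_dcw(
    peak_area_sample: float,
    peak_area_standard: float,
    standard_conc_mg_per_l: float,
    extract_volume_ml: float,
    dry_cell_mass_mg: float,
) -> float:
    """Astaxanthin content in mg per g dried cell mass.

    Assumes a linear detector response through the origin
    (single-point external standard), which is an assumption of this sketch.
    """
    if peak_area_standard <= 0 or dry_cell_mass_mg <= 0:
        raise ValueError("standard peak area and dry cell mass must be positive")
    conc_mg_per_l = peak_area_sample / peak_area_standard * standard_conc_mg_per_l
    amount_mg = conc_mg_per_l * extract_volume_ml / 1000.0
    return amount_mg / (dry_cell_mass_mg / 1000.0)


# Hypothetical numbers for illustration only.
content = astaxanthin_mg_per_g_dcw(
    peak_area_sample=500.0,
    peak_area_standard=1000.0,
    standard_conc_mg_per_l=10.0,
    extract_volume_ml=1.0,
    dry_cell_mass_mg=5.0,
)
print(content)
```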
1. An isolated polynucleotide comprising one or more nucleic acid molecules selected
from the group consisting of:

(a) nucleic acid molecules encoding at least the mature form of the polypeptide depicted in
SEQ ID NO:3;

(b) nucleic acid molecules comprising the coding sequence as depicted in SEQ ID NO:2;

(c) nucleic acid molecules whose nucleotide sequence is degenerate as a result of the
genetic code to a nucleotide sequence of (a) or (b);

(d) nucleic acid molecules encoding a polypeptide derived from the polypeptide encoded
by a polynucleotide of (a) to (c) by way of substitution, deletion and/or addition of one or
several amino acids of the amino acid sequence of the polypeptide encoded by a nucleotide
of (a) to (c);

(e) nucleic acid molecules encoding a polypeptide derived from the polypeptide whose
sequence has an identity of 56.3 % or more to the amino acid sequence of the polypeptide
encoded by a nucleic acid molecule of (a) or (b);

(f) nucleic acid molecules comprising a fragment encoded by a nucleic acid molecule of
any one of (a) to (e) and having acetyl-CoA carboxylase activity;

(g) nucleic acid molecules comprising a polynucleotide having a sequence of a nucleic acid
molecule amplified from a Phaffia nucleic acid library using the primers depicted in SEQ
ID NO:4, 5, and 6;

(h) nucleic acid molecules encoding a polypeptide having acetyl-CoA carboxylase activity,
wherein said polypeptide is a fragment of a polypeptide encoded by any one of (a) to (g);

(i) nucleic acid molecules comprising at least 15 nucleotides of a polynucleotide of any one
of (a) to (d);

(j) nucleic acid molecules encoding a polypeptide having acetyl-CoA carboxylase activity,
wherein said polypeptide is recognized by antibodies that have been raised against a
polypeptide encoded by a nucleic acid molecule of any one of (a) to (h);

(k) nucleic acid molecules obtainable by screening an appropriate library under stringent
conditions with a probe having the sequence of the nucleic acid molecule of any one of (a)
to (j), and encoding a polypeptide having acetyl-CoA carboxylase activity;

(l) nucleic acid molecules whose complementary strand hybridizes under stringent
conditions with a nucleic acid molecule of any one of (a) to (k), and encoding a
polypeptide having acetyl-CoA carboxylase activity.

2. An isolated polynucleotide comprising one or more nucleic acid molecules selected
from the group consisting of:

(m) nucleic acid molecules comprising the nucleotide sequence as depicted in SEQ ID
NO:1;

(n) nucleic acid molecules whose nucleotide sequence is degenerate as a result of the
genetic code to a nucleotide sequence of (m);

(o) nucleic acid molecules encoding a polypeptide derived from the polypeptide encoded
by a polynucleotide of (m) or (n) by way of substitution, deletion and/or addition of one
or several amino acids of the amino acid sequence of the polypeptide encoded by a
nucleotide of (m) or (n);

(p) nucleic acid molecules encoding a polypeptide derived from the polypeptide whose
sequence has an identity of 56.3 % or more to the amino acid sequence of the polypeptide
encoded by a nucleic acid molecule of (m);

(q) nucleic acid molecules comprising a fragment encoded by a nucleic acid molecule of
any one of (m) to (p) and having acetyl-CoA carboxylase activity;

(r) nucleic acid molecules comprising a polynucleotide having a sequence of a nucleic acid
molecule amplified from a Phaffia nucleic acid library using the primers depicted in SEQ
ID NO:4, 5, and 6;

(s) nucleic acid molecules encoding a polypeptide having acetyl-CoA carboxylase activity,
wherein said polypeptide is a fragment of a polypeptide encoded by any one of (m) to (r);

(t) nucleic acid molecules comprising at least 15 nucleotides of a polynucleotide of any one
of (m) to (o);

(u) nucleic acid molecules encoding a polypeptide having acetyl-CoA carboxylase activity,
wherein said polypeptide is recognized by antibodies that have been raised against a
polypeptide encoded by a nucleic acid molecule of any one of (m) to (s);

(v) nucleic acid molecules obtainable by screening an appropriate library under stringent
conditions with a probe having the sequence of the nucleic acid molecule of any one of
(m) to (u), and encoding a polypeptide having acetyl-CoA carboxylase activity;

(w) nucleic acid molecules whose complementary strand hybridizes under stringent
conditions with a nucleic acid molecule of any one of (m) to (v), and encoding a
polypeptide having acetyl-CoA carboxylase activity.

3. The isolated polynucleotide of claim 1 or 2, wherein said polynucleotide encodes an amino
acid sequence which is identified by SEQ ID NO:3 or has an identity of 56.3 % or more with
SEQ ID NO:3.

4. The isolated polynucleotide of any one of claims 1 to 3, wherein said polynucleotide is
derived from a strain of P. rhodozyma or Xanthophyllomyces dendrorhous.
5. A method for making a recombinant vector comprising inserting the polynucleotide of
any one of claims 1 to 4 into a vector.

6. A recombinant vector containing the polynucleotide of any one of claims 1 to 4 or
produced by the method of claim 5.

7. The vector of claim 6 in which the polynucleotide of any one of claims 1 to 4 is
operatively linked to expression control sequences allowing expression in prokaryotic or
eukaryotic cells.

8. A method of making a recombinant organism comprising introducing the vector of
claim 6 or 7 into a host organism.

9. The method of claim 8, wherein said host organism is selected from E. coli, baculovirus,
or S. cerevisiae.

10. The recombinant organism containing the vector of claim 6 or 7, or produced by the
method of claim 8 or 9.

11. A process for producing a polypeptide having acetyl-CoA carboxylase activity
comprising culturing the recombinant organism of claim 10 and recovering the
polypeptide from the culture of said recombinant organism.

12. A polypeptide obtainable by the process of claim 11.

13. An antibody that binds specifically to the polypeptide of claim 12.

14. An antisense polynucleotide against the polynucleotide of any one of claims 1 to 4.

15. A method for making a recombinant vector comprising inserting the polynucleotide of
claim 14 into a vector.

16. A recombinant vector containing the polynucleotide of claim 14 or produced by the
method of claim 15.

17. The vector of claim 16 in which the polynucleotide of claim 14 is operatively linked to
expression control sequences allowing expression in prokaryotic or eukaryotic cells.

18. A method of making a recombinant organism comprising introducing the vector of
claim 16 or 17 into a host organism.




19. The method of claim 18, wherein said host organism belongs to a strain of Phaffia
rhodozyma or Xanthophyllomyces dendrorhous.

20. The recombinant organism containing the vector of claim 16 or 17, or produced by the
method of claim 18 or 19.

21. The recombinant organism of claim 20, wherein said organism is characterized in that
its gene expression of acetyl-CoA carboxylase is reduced compared to the host
organism, and is thereby capable of producing carotenoids at an enhanced level relative to
the host organism.

22. The recombinant organism according to claim 21, wherein the gene expression of
acetyl-CoA carboxylase is reduced by means of a technique selected from antisense
technology, site-directed mutagenesis, error prone PCR, or chemical mutagenesis.

23. A process for producing carotenoids, which comprises cultivating the recombinant
organism of claim 21.

24. The process of claim 23, wherein said carotenoids are one or more selected from
astaxanthin, β-carotene, lycopene, zeaxanthin, and canthaxanthin.

25. The process according to claim 23, wherein the gene expression of acetyl-CoA
carboxylase is reduced in the recombinant organism of claim 21 by means of a technique
selected from antisense technology, site-directed mutagenesis, error prone PCR, or
chemical mutagenesis.

26. A process for the production of a carotenoid by culturing a microorganism under
suitable conditions and, optionally, recovering the resulting carotenoid, wherein the
microorganism is characterized in that its gene expression of acetyl-CoA carboxylase is
reduced, e.g. by means of a technique selected from antisense technology, site-directed
mutagenesis, error prone PCR, or chemical mutagenesis.
SEQUENCE LISTING

<110> Roche Vitamins AG
<120> ACC gene
<130> NDR5217

<170> PatentIn version 3.1






<210> 1
<211> 10561
<212> DNA
<213> Phaffia rhodozyma
<220>
<221> 5'UTR
<222> (1221)..(1222)
<223>
<220>
<221> polyA_site
<222> (5813)..(9814)
<223>
<220>
<221> exon
<222> (1446)..(1482)
<223>
<220>
<221> exon
<222> (9231)..(9530)
<223>
<220>
<221> exon
<222> (7296)..(9160)
<223>
<220>
<221> exon
<222> (7048)..(7227)
<223>
<220>
<221> exon
<222> (6899)..(6976)
<223>
<220>
<221> exon
<222> (5871)..(6832)
<223>
<220>
<221> exon
<222> (5674)..(5805)
<223>
<220>
<221> exon
<222> (5456)..(5608)
<223>
<220>
<221> exon
<222> (4984)..(5384)
<223>
<220>
<221> exon
<222> (4096)..(4911)
<223>
<220>
<221> exon
<222> (3828)..(4026)
<223>
<220>
<221> exon
<222> (3075)..(3443)
<223>
<220>
<221> exon
<222> (3518)..(3552)
<223>
<220>
<221> exon
<222> (1676)..(1758)
<223>
<220>
<221> exon
<222> (1833)..(1957)
<223>
<220>
<221> exon
<222> (2031)..(2171)
<223>
<220>
<221> exon
<222> (2244)..(2641)
<223>



<220>
<221> exon
<222> (2746)..(2991)
<223>
<220>
<221> exon
<222> (3626)..(3750)
<223>
<220>
<221> Intron
<222> (1483)..(1675)
<223>
<220>
<221> Intron
<222> (4912)..(4963)
<223>
<220>
<221> Intron
<222> (3751)..(3827)
<223>
<220>
<221> Intron
<222> (5385)..(5455)
<223>
<220>
<221> Intron
<222> (6977)..(7047)
<223>
<220>
<221> Intron
<222> (3553)..(3625)
<223>
<220>
<221> Intron
<222> (5806)..(5870)
<223>
<220>
<221> Intron
<222> (9161)..(9230)
<223>
<220>
<221> Intron
<222> (3444)..(3517)
<223>
<220>
<221> Intron
<222> (2992)..(3074)
<223>
<220>
<221> Intron
<222> (7228)..(7295)
<223>
<220>
<221> Intron
<222> (6833)..(6898)
<223>
<220>
<221> Intron
<222> (5609)..(5673)
<223>
<220>
<221> Intron
<222> (4027)..(4095)
<223>
<220>
<221> Intron
<222> (2642)..(2745)
<223>
<220>
<221> Intron
<222> (2172)..(2243)
<223>
<220>
<221> Intron
<222> (1958)..(2030)
<223>
<220>
<221> Intron
<222> (1759)..(1832)
<223>



<400> 1

caacagacag acaaaggaac ctacgtgtac atactggtct ttccaacgtc gcggcgtcga 60
gattaactag aacaacactc gacaatcgaa tctcttattc cgccctagtt gaaggcgtct 120
gttcaaatcg atcaagatct tccaatcatc gacatccagg tattcgcact cgactctgcc 180
cgcacgtacc gccccgattt tcttatggcc adcagatctc aactctgata tacattggtc 240
caccctgtct ttgtctcttt gcctttcgtt cdatctagcg ctgttcaacg gatcacccag 300
tcggcttgac tcaactccct ctggaacgtg tgccttatct caggttctga tttctcctca 360
gccagtatgc gcacaaagca gcgaccgcga ctttttgctc cataagacct ctcagcgggg 420
aatatatgac actcatacat cgatagctcg tatgttttct ttgatcactt cctaaaatgt 480
aacggcaact gacattcaac atgatgcgct ttcatagatc aactacttcc gactacgatg 540
accgtccttc tatacagccc agtcagctng tcgacctcac ataaagtgac tgagaccgcg 600
atctcgaaca tcttattcct tccaccgtta gctgagaagt ggattacacc atcaatagaa 660
tcatctaccc cgttcttgcc tggactaatg cgtcaggagc tcttggatka aggagaaata 720
gctgagcaga ccatcacctt ggacgatgtc cgtctgtggc tgaactccgg aggtcgagtg 780
gcgtgctgca acgcacttcg aggaattt^rg gaagtgaacc tcgtttggag tgataaatga 840
gattacgaaa gtctgttcga aacatccatg cttcatgata accgataapg cttaaatctt 900
gagagtgcgc acatcgatcg ccttttatat acggggttgg ggaaacataa agtgttcata 960
gactattgtt catatatctt aaagtacaaa gacgcatcta accctaagbe tgaatgattg 1020
gcaaaatcct agtaagaccg ngaaatccqg aagaatacgc agttcattaa taaagatata 1080
gcttaggtaa gcagcggttg ctcccccaMc caacctcatc cgaaattccc cagggggttg 1140
agattctcaa ggctttgaat ccccatcccg tcaagttggt cttaaaccct tcatctctac 1200
ttgttacttc ttttcttcct gacctccttc cdccactccc tcctattctc tgaacgaact 1260
cgcctccctg tccatctact cttcttcggt tttcttttgg gcttttactc cccccgctcc 1320
tcctccatct ttccatctct tttcgtatct gtgggtaact ttgcatccaa gggccctcac 1380
acataaccct atatccatct tcctccattc aciaeacatet gtactcaacc aacaaagctc 1440

acaag atg gtt gtc gat cac gag agc gta agg cat ttc atc g 1482
Met Val Val Asp His Glu Ser Val Arg His Phe Ile
1 5 10

gtaagcgttc ttgtrctttt ccttgtctgg ctccctgcat tttcttaaac gatctaggaa 1542
gagagggaaa ttacatctgg tcaattttcc gcgctctttt ccttggggac aaaagaatgc 1602
ctttctgtga tcggagatcg gttgctgatc ccttttgtct tgctcctttt gctctttccc 1662

tcccctttac cag gt gga aac gca ctt gag aac gcc cct ccg tca agc 1710
Gly Gly Asn Ala Leu Glu Asn Ala Pro Pro Ser Ser
15 20

gcc acc gat ttc gtt aga agt caa gat ggt cac acg gtc atc acc aaa 1758
Val Thr Asp Phe Val Arg Ser Gln Asp Gly His Thr Val Ile Thr Lys
25 30 35 40

gtcagtaatt ttcatttttt ccttcacgta gcctcagggc caaggagcta aattgcttct 1818

gtatcatttc tcag gtc ctc att gcc aac aac gga acc gct gct gta aaa 1868
Val Leu Ile Ala Asn Asn Gly Ile Ala Ala Val Lys
45 50

gag atc cga tca gtt cgt aaa tgg gct tac gag acg ttt gga gat gag 1916
Glu Ile Arg Ser Val Arg Lys Trp Ala Tyr Glu Thr Phe Gly Asp Glu
55 60 65

cga gcc atc gaa ttt acg gta atg gcc act cca gaa gat tt 1957
Arg Ala Ile Glu Phe Thr Val Met Ala Thr Pro Glu Asp Leu
70 75 80

gttcgtacca atcacataag ctttccttga gtcagggaca tcctctaatt aattcaactt 2017

gagcgccata cag g aag gtg aac tgc gac tat att cga atg gct gat cga 2067
Lys Val Asn Cys Asp Tyr Ile Arg Met Ala Asp Arg
85 90

gtc gtc gaa gtt cct gga gga act aac aac aac aat cac tct aac gtc 2115
Val Val Glu Val Pro Gly Gly Thr Asn Asn Asn Asn His Ser Asn Val
95 100 105 110

gac ctc atc gtt gac atc gcc gag cga ttc aat ata cac gct gtc tgg 2163
Asp Leu Ile Val Asp Ile Ala Glu Arg Phe Asn Ile His Ala Val Trp
115 120 125

gct gga tg gtaagtaaaa taggacctta acatgttgga agaagagtgt 2211
Ala Gly Trp






ccacttaaac gcgctttctt tccatccgac ag g ggt cac gct tcg gaa aac ccc 2265
Gly His Ala Ser Glu Asn Pro
130 135

aga ctt ccc gag tct ctc gcc gcc tca aag aac aag atc gtc ttc att 2313
Arg Leu Pro Glu Ser Leu Ala Ala Ser Lys Asn Lys Ile Val Phe Ile
140 145 150

ggt cct ccc gga tcc gct atg cgi tcc ctt gga gac aag att tct tcg 2361
Gly Pro Pro Gly Ser Ala Met Arg Ser Leu Gly Asp Lys Ile Ser Ser
155 160 165

acc atc get gcc cag tct gcc cag gtg ccg tgt atg gcc tgg tct gga 2409
Thr Ile Val Ala Gln Ser Ala Gln Val Pro Cys Met Ala Trp Ser Gly
170 175 180

tca ggc atc act gat aca gag ct£ agd cct cag ggc ttc gtg act gtg 2457
Ser Gly Ile Thr Asp Thr Glu Leu Ser Pro Gln Gly Phe Val Thr Val
185 190 195 200

ccc gat ggg cca tat cag gct gct tgt gta aag acg gtg gag gat ggt 2505
Pro Asp Gly Pro Tyr Gln Ala Ala Cys Val Lys Thr Val Glu Asp Gly
205 210 215

ttg gtg cga gcc gag aag atc ggl ttg cca gtt atg atc aag gcc tct 2553
Leu Val Arg Ala Glu Lys Ile Gly Leu Pro Val Met Ile Lys Ala Ser
220 225 230

gag gga gga gga gga aag ggt atc cga atg gtt cac agc atg gac aca 2601
Glu Gly Gly Gly Gly Lys Gly Ile Arg Met Val His Ser Met Asp Thr
235 240 245

ttc aag aac tcc tac aac tcc gtc gct tcc gag gtg cca g gtaagttcac 2651
Phe Lys Asn Ser Tyr Asn Ser Val Ala Ser Glu Val Pro
250 255 260

tctgtttgac tggagatttg agcacaatct ctaccatggg agttcaagaa ggaataccca 2711

ctcatgaatt gacgactgcg ttcttgacqt ctag ga tct ccg att ttc atc atg 2765
Gly Ser Pro Ile Phe Ile Met
265

gcc ttg gct gga tct gct cga cat ttg gag gtc cag ctc ctt gct gat 2813
Ala Leu Ala Gly Ser Ala Arg His Leu Glu Val Gln Leu Leu Ala Asp
270 275 280



cag tac gga aac get ate tct ttg ttc ggt cga gat tgc pet 0tt cag 
5 Gin Tyr Gly Asn Ala He Ser Leu Phe] Gly Arg Asp Cys Ser Val Gin 
205 290 ; 295 300 



2861 



cga cga cat cag aag ate ate gag gag get ccc gtc acg &tc get cgt 
Arg Arg His Gin Lys He He Glu. Glu Ala Pro Val Thr lie Ala Arg 
10 305 \ 310 \ 315 



2909 



15 



cca gag aga ttc gaa gag atg gag aag get get gtc agg ptg gec aag 2957 
Pro Glu Arg Phe Glu Glu Met Glu Lys Ala Ala Val Arg £eu Ala Lys 
320 325 330 

i . 1 

tta gta gga tat gtt agt gee ggt aeo gtc gaa t gtaagg&aca 3001 
Leu Val Gly Tyr Val Ser Ala' Gly Thr Val Glu 
335 340 1 



aacagctacc tctcattetg ttttttcgag atzagtcaact tacatcactt ttettttgee 3061 

ggattttctt tag ac etc tac tct cac ;gcc gac gac tea ttc ttc ttc 3109 
Tyr Leu Tyr Ser : His ; Ala Asp Asp Ser Phe Phe Phe 
345 ? 350 ) 355 



etc gaa etc 
Leu Glu Leu 



aac cct cga ctt caa gtc gag cac cct act acc gag .atg 
Asn Pro Arg Leu Glri Val Glu His Pro Thr Thr Glu Met 
360 : 365 \ 370 



3157 



gtc teg ggt 
Val Ser Gly 

ate cct ctt 
He Pro Leu 
390 



gtc aac ctt ccc get get cag ctt cag att get aug ggt 
Val Asn Leu Pro Ala Ala Gin Leu Gin He Ala Met Gly 
375 380 385 

tct cga att egg gat att cga gtc etc tac ggt etc gat 
Ser Arg He Arg Asp lie Arg Val Leu Tyr Gly Leu Asp 
395 400 ■ 



3205 



3253 



ccc cac act gtt tec gag ate gac ttc gac age age aga gcg gag tct 
Pro His Thr Val Ser Glu lie Asp Phe Asp Ser Ser Arg Ala Glu Ser 
405 • 410 1 415 



3301 



gec cag act cag agg aag cct agg ecq aag ggt cac gec att gec tgt 
Val Gin Thr Gin Arg Lys Pro Arg Pro Lys Gly His Val lie Ala Cys 



420 



425 



430 



435 



3349 






cga ate acg age gaa aac ccc gat gag ggg ttc aag ccg tct gec gga 
Arg He Thr Ser Glu Abii Pro Asp Glu Gly Phe Lys Pro Ser Ala Gly 



440 



445 



450 



339? 



5 gat ace caa gag teg aac ttc aga agt aat act aac gtc tgg gga t 
Asp. lie Gin Glu Leu Asn Phe Arg Ser Asn Thr Asn val Trp Gly 
455 460 465 



3443 



10 



gtgagtacag aggcttctca aagattctta tgcggaacaa atctctgact cttaaattgt 3503 

* 

; ■ » 
gtttgacttt caag ac ttc tct gtt gga get act gga gga att cat agt. 3552 
Tyr Phe Ser Val Gly Ala Thr Gly Gly He His Ser 
470 475 * 



15 gtaagtttct tcgccaacaa tataatcaca cfcagatccct atetaatefcg aactggctta 

i 

tctcttgtta tag ttc gec gat tct caattc ggt eac gtg tct get tat 

Phe Ala Asp Ser .Gin : Phe Gly His Val Phe Ala Tyr 

480 485 ' 490 



20 



ggc tec gac cga acg act gee aga aag aat atg gtt ate poc ttg aaa 
Gly Ser Asp Arg Thr Thr Ala Arg Lys Asn Met Val He Ala Leu Xys 
495 1 500 ; 505 



3612 
3661 

3709 



25 gag ctt tec att cga gga gac etc cga acc acc gtc gag ta 
Glu Leu Ser He Arg Gly Asp Phe Arg Thr Thr Val Glu ^cyr 
510 .515 



3750 



gtgegtatag cctggtacat ctcctttcaa tcacttacga tgaactgape gatctgtctc 

30 ? ; 



3810 



35 



gatcacgttt aatctag t ctt ate act ctt ctt gag acg age* gat ttc gag 3861 

Leu lie Thr Leu Leu Glu Thr Ser.. Asp Phe Glu 
: 525 i 530 

; i 

cag aac gec att acc acc get tgg ttg gat ggg ttg ate act aac aag 3909 
Gin Asn Ala He Thr Thr Ala Trp Leu Asp Qly Leu He Thr Asn Lys 



535 



540* 



545 



40 ctt aca tct gag agg cct gat cca tea ctg gec gtt att tgt ggt gca 
Leu'Thr Ser Glu Arg Pro Asp Pro, Ser Leu Ala Val He Cys Gly Ala 
550 555. •! 560 5 



3957 




ate gtg aaa get cac gtg get tot gag aac tgt tgg gec gaa fcac cga 
lie Val Lys Ala His Val Ala Sett Giu Ajm Cys Trp Ala Glu Tyr . Arg 



565 



570 



575 



4005 



cga gca teg gac aag gga cag gtaagcfectg tttctcatga agtttttgac 
Arg Val Leu Asp Lys Gly Gin '* [ 

580. 585 - ; 



4056 



tgaggcacte accactccgt acatgtttcc tgtttttag gtt ccc tec aag gac 4110 
10 Val Pro Ser Lys Asp 

■ ; 590 

. i i 

act etc aag aca gtg ttc act ctt gat: tec ate tat gag ^gt gtt egg 4158 

Thr Leu Lys Thr Val Phe Thr Leu As£ Phe He Tyr Glu Gly Val Arg 
15 595 600 feOS 



20 



25 



35 



tac aat ttc 
Tyr Asn Phe 
610 

eta aac gga 
Leu Asn Gly 
625 

gga atg etc 
Gly Met Leu 
640 



ace get get 
Thr Ala Ala 



gga aag acc 
Gly. Lys Thr 



gag gaa gtc 
30 Glu Glu Val 

att gag cag 
He Glu Gin 



gtt ctt etc 
val Leu Leu 
645 

ggt acc etc 
Gly Thr Leu 

660 
gag aac gac 
Glu Asn Asp 
675 



cga gee 
Arg Ala 
615 

i 

gtg gtg 
val val 
630 

gat ggc 
Asp Gly 



teg etc aac 
Ser Leu Asn 



cga att 
Arg He 

ccc act 
Pre Thr 



act tac pga ttg tat 
Thr Tyr Arg Leu Tyr 

620 j 

\ 

cct ttg gee gat ggt 
Pre Leu Ala Asp Gly 
635 5 

act etc tac tgg agg 
Thr Leu Tyr Trp Arg 

: 655 

; j 

cag gta gac gca aag act tgc ctg 
Gin Val Asp Ala Lys thr Cys Leu 

i 670 
tea ccc ccg cet gga 
Ser Pro Ser Pro Gly 
$85 



4206 



tec ate cga 
Ser He Arg 

j 

cga tee cac 
Arg Ser Hie 
' 650 



' 665 
cag etc cga 
Gin Leu Arg 
680 



aag ate ate egg ttt ttg gtc gaa age gga gat cac ate tec tec gga 
Lys He He Arg Phe Leu Val Glu Ser Gly Asp His He Ser Ser Gly 
690 695 700 : 



4254 



4302 



4350 



4398 



4446 



40 gat ate tat get gag gtc gag gtc atg aag atg ate ttg ccc ttg att 
Asp He Tyr Ala Glu Val Glu Val Met Lys Met He Leu Pro Leu He 
705 710 715 \ 



4494 



10 



30 



35 




gcc cag gag tec ggt cac gtt'cag ttt gtc aag caa gec jggfc gcg ace ' 4542 

Ala Gin Glu Ser Gly His Val Gin Pne val Lya Gin Ala ;Gly Val THr 

720 725 730 735 

• j 

i 

gtc gat cct gga gcg att att ggg ate ttg agt etc gat >gac cct acg, 4590 

val Asp Pro Gly Ala lie lie Gly lie Leu Ser Leu Asp jAsp Pro Thr 
740 ; 745 J 750 

| i 

cga gtg aag aag gcg aag ccc etc gag ggt etc ctg cct ;gtg act ggt 4638 
Arg val Lys Lys Ala Lys Pro Phe Glu Gly Leu Leu Pro !Val Thr . Gly 
755 760 ^765 



etc cct aac ctg ccc ggt aac aga cct cac cag egg eta ;cag ttc cag 
Leu Pro Asn Leu Pro Gly Asn Arg Pro His Gin Arg Leu >Gln Phe- Gin 
15 770 775 780 ] 

! ! * 

ctt gag teg ata tac teg gtc ttg gat gga tac gag agt :gac tec act 
Leu Glu Ser lie Tyr Ser Val Leu Asp Gly Tyr Glu Ser ;Asp Ser Thr 
785 790 795 \ 

; 

20 ; : 

gca aca ate etc cga tea ttc fcet gaa aac ctt tat gat ".cct gat ctt 
Ala Thr ile Leu Arg Ser Phe Ser Glu Asn Leu Tyr Asp jPro Asp Leu 
800 805 810 ) 815 

25 get ttc gga gag get tta tee ate att tec gtc ctt tct -ggg aga atg 
Ala Phe Gly Glu Ala Leu ser ile He Ser val Leu Ser :Gly Arg Met 
820 * * 825 I 830 



4686 



4734 



4782 



4830 



4878 



cct gec gat ctt gag gag age att cga gag gtc ate age *gaa get cag 
Pro Ala Asp Leu Glu Glu Ser Ile Arg Glu val lie Ser &lu Ala Gin 

835 840 '845 

teg aag cct cac gee gag ttc ccc gga tea aag gtgtgtagtt gategcagag 4931 
Ser Lys Pro His Ala Glu Phe Pro Gly Ser Lys j 
850 855 

.* 

ttatgactgt atacatcgac cagaagctta cecatctctt tcgtgtgCac ag ate etc 4989 

: • He Leu 

s 

i 860 

40 aaa gtc gtc gag egg tac ate gat aat ttg cga cct cag :gag agg get 5037 
Lys Val Val Glu Arg Tyr lie Asp Asji Leu Arg Pro Gin "Glu Arg Ala 
865 % 870 875 




acg gtc cga act cag ate gaa ccc ate get ggt att get gag aag aac 
Met Val Arg Thr Gin. lie Glu Pro- He Val Gly He Ala Glu Lys Aen 
880 , 88S 890 

gtt ggc ggt cct aag ggt tac gec tct tac gtc tta get acc ate etc 
val Sly Gly Pro Lys Gly Tyr Ala Ser Tyr Val Leu Ala Thr lie Leu 
895 9QQ 9 05 { 



5085 



5133 



20 



25 



30 



35 



caa aag ttc ctg gec gtt gag gcq gtt ctt get act ggt agt gaa gag 
10 Gin Lys Phe Leu Ala Val Glu Ala Val Phe Ala Thr Gly Ser Glu Glu 
910 ' 915 "* 920 '. 

# • * 

gec att gtt' etc caa ctt cga gat gaa aac cga gaa tot !ttg aac gac 
Ala He val Leu Gin Leu Arg Asp Glu Asn Arg Glu Ser Leu Asn Asp 
13 925 : »M . i 935 I 940 

gtc ctt ggt etc gtc ctg get cag tcfir cgt etc age get Sega tec aag 
Val Leu Gly Leu Val Leu Ala His seir Arg Leu Ser Ala Arg Ser Lys 
3*5 . i 950 955 

ctt gtt etc tec gtc ttt gat ctg ate aag tct atg cag etc etc aac 
Leu Val Leu Ser val Phe Asp Leu Il ? Lys Ser Met Gin Leu Leu Asn 

960 : 965 g 70 

i ! 

aac act gag ggt tet ttc ctt cat aag act atg aaa gcg Jctt gec gac 
Asn Thr Glu Gly Ser Phe Leu His Lys Thr Met Lys Ala Leu Ala Asp 
975 9fl 0 985 : 

! i 

atg. ccc acc aa gtaggtttcc tcttgtagte tacaaactat tgttgcgatg 
Met Pro Thr Lya 

990 : 
tgttgacaaa gactccgttt ccgatetata g!g get cct Ctg gee age aag gtg 

. ) Ala Pro Leu Ala ser Lys val 
' : 995 i 

tct ttg aag get egg gaa att ctt ate tct tgc tct c'tt ccc tct 
Ser Leu Lya Ala Arg Glu lie Leu ile Ser Cys Ser Leu Pro Ser ' 
iooo 1005 mo 



5181 



5229 



5277 



5325 



5373 



5424 



5477 



5522 



40 tac 



gag gag agg ttg ttc cag atg &aa aag aCc ctt ^ t£Jt tefc 
Tyr Glu Glu Arg Leu Phe Gin Met Glu Lys lie Leu Asn Ser Ser 
1015 1020 ; 102S 



5567 




gtc acc ace tct toe tac gga sag act gga ggt gga c?iC ag 

val Thr Thr Ser Tyr Tyr Gly Glu tfhr Gly Gly Gly His Arg 
1030 1035 1040 



5608 



5 gtttgtcctc fceeeatgtgt ttctagttca tagetctctg ctgactcfcga tcegattttc 



10 



aacag a aac ect teg gtt gat gt;t ctg 
Asn Pro Ser Val Asp Val Leu 
1045 ! iqso 

cga ttc acc gtc tac gat gtc ctg tec 
Arg- Phe Thr Val Tyr Asp Val Leu Ser 
1060 1Q65 



act gag ate tea aac tct 
Thr Glu He Ser Asn Ser 
1055 

tec ttc ttc aa£ cac gat 
Ser Phe Phe Lys His Asp 

: 1070 



5669 
5713 



5758 



15 gat cct tgg att gtt ctt get agt ttjg acc gtc tac gtt ctt cga 
Asp Pro Trp He Val Leu Ala Ser Leu Thr Val Tyr Val Leu Arg 
1075 1080 1085 



5803 



20 



gc gtaagtgatc gttcttctcc tcttgccc&a acaatgactg acagttctat 

Ala r < 



5855 



25 



30 



35 



ctattccatc tgcag t tac cga gag tac agt att ctt gat atg caa cat 

Tyr Arg Glu Ty£ Ser He Leu Asp \ Met Gin His 

3 » 



1090 



1095* 



gag caa ggt cag 
Glu Gin. Gly Gin 
1100 . 



aag etc aac cag 
Lys Leu Asn Gin 
1115 

teg aat cga gac 
Ser Asn Arg Asp 
1130 



gat ggc get get gga gtc ate act tgg cga ttc 
Asp Gly Ala Ala Gly Val He Thr Trp Arg Phe 
1105 : * 1110 : 



ccc ate get gag tct tct act ccc 
Pro He Ala Glu Ser Ser Thr Pro 
1120 I 1125 

gtt tac cga gtc ggt teg ctt tct 
Val Tyr Arg val Gly Ser Leu Ser 
1135 j 1140 



cga gtt gac 
Arg Val Asp 

f 
i 

gat ttg acc 
Asp Leu Thr 



5904 



5949 



5994 



6039 



40 tac aag ate aag 
Tyr Lys He Lys 
1145 



cag agt cag acc gag ccc etc cga get ggt gtc 
Gin Ser Gin Thr Glu Pro Leu Arg Ala Gly Val 
1150 : ! 1155 



6084 






atg acg age ttc aac aac teg 
Met Thr Ser Phe Asn Asn Leu 
1160 1165 



aag gag gtt cag gac gga etc ttg 
Lys Glu Val Gin Asp Gly Leu Leu 
! 1170 1 



6129 



aat gtt ctg tct ttc etc cct get tac cat cat caa gkfc ttc act 
Asn val Leu Ser Phe Phe Pro Ala Tyr His His Gin Asp Phe Thr 
117 5 HBO ; lies ! 



6174 



caa cga .cat ggt cag gac agt 
10 Gin Arg His Gly Gin Asp Ser 
1190 H95 



get ate egg get ttc gag gag 
Ala lie 'Arg Ala Phe Glu Glu 
1205 1210 



15 



20 



Srcc atg ccc aac gtt 
Ala Met Pro Asn Val 
1200 



etc aac att 
Leu Asn lie 



6219 



aag gac gac atg tct 
Lys Asp Asp Met Ser 
! 1215 



tgg gec aag agt gtc gag teg 
Trp Ala Lys Ser Val Glu' Ser 
1220 1225 

ate cag aag aag gga att cga 
lie Gin Lys Lys Gly lie Arg 
1235 1240 



ctg gta atg cag atg 
lieu val Met Gin Met 
* 1230 

cga gtt acc ttc ttg 
Arg val Thr Phe Leu 
; 1245 



gat ctt gat 
Asp Leu Asp 



tct gec gag 
Ser Ala Glu 



gtt tgc cga 
val Cys Arg 



25 aag ggc gtt tac ccc tec tac ttc acc ttc aga caa 
Lys Gly Val Tyr Pro Ser Tyr Phe Thr Phe Arg Gin 
"50 1255 ] 1260 



sag ggt gec 
Glu Gly Ala 



6264 



6309 



6354 



6399 






cag ggc 
Gin Gly 

1265 
get eta 
Ala Leu 

1280 

gtc acc 
Val Thr 
1295 



ccc tgg aga gag gag gag aag att cga aac 
Pro Trp Arg Glu Glu Glu Lys He Arg Asn 

1270 : ! 1275 
gec agt cag ctt gag etc aac cga etc teg 
Ala Ser Gin Leu Glu Leu Asn Arg Leu Ser 

1285 : 1290 

cct ate ttc gta gac aac aga cag ate cac 
Pro He Phe Val Asp Asn Arg Gin He His 
1300 : 1305 



ate gag cct 6444 
He Glu Pro 

aat ttc aag 6489 
Asn Phe Lys 



a;tc tac aag 6534 
lie Tyr Lys 



40 gga gtg 
Gly Val 



aag gag aac tct tec gat gtt cga ttc ttt ate egg 
Gly Lys Glu Asn Ser Ser Asp val Arg Phe Phe He Arg 
1310 1315 : - 1320 : 



6579



gct ttg gtt cga cct gga cgg gtc cag gga tcg atg aag gct gcc
Ala Leu Val Arg Pro Gly Arg Val Gln Gly Ser Met Lys Ala Ala
    1325                1330                1335



6624 



gag tat ctc atc tcc gag tgc gat cga ctg ctc act gat atc ctg
Glu Tyr Leu Ile Ser Glu Cys Asp Arg Leu Leu Thr Asp Ile Leu
    1340                1345                1350



6669 



gac gec 
10 Asp Ala 
1355 



ttg gag gtt gtt gga gec gag act cga aac gec gat tgc 
Leu Glu Val Val Gly Ala Glu Thr Arg Asn Ala Asp Cys 
1360 1365 



6714 



15 



aac cat gtt gga att aac ttc atc tat aac gtt ctt gtc gac ttc
Asn His Val Gly Ile Asn Phe Ile Tyr Asn Val Leu Val Asp Phe
    1370                1375                1380



6759 



gac gac gtc cag gag gcc ctt gcc ggg ttc att gag agg cac gga
Asp Asp Val Gln Glu Ala Leu Ala Gly Phe Ile Glu Arg His Gly
    1385                1390                1395



6804 



aag agg ctt tgg cga ctt cga gtg acc g gtaagtgttc teteggcatt 
Lys Arg Leu Trp Arg Leu Arg val Thr 
1400 1405 



6852 



gaattcagca atgagctgtg actaaegggt ttcttcggta tattag ct tct gaa 

Ala Ser Glu 
1410 



6906 



ate cga atg gtt ctt gag gac gac gag ggt aac gtc acc ccc ate 6951 
He Arg Met Val Leu Glu Asp Asp Glu Gly Asn Val Thr Pro He 

1415 1420 1425 

cga tgc tgc att gag aac gtt tct g gtaagcagtc caaaataact 6996 
Arg Cys Cys He Glu Asn Val Ser 

1430 



gataatccta ttcagtctag acattgtaac tgatgcattt ctegttefcta g gt ttc 

Gly phe 
1435 



7052 



gtc gtg aag tac cac gcc tac cag gag gtt gag acc gag aag ggt
Val Val Lys Tyr His Ala Tyr Gln Glu Val Glu Thr Glu Lys Gly
                1440                1445                1450



7097 






act acc ate ttg aag tea ate gga gac ctt gga cct ctt cac ctt 
Thr Thr lie. Leu Lys , Ser He Gly Asp Leu Gly Pro Leu His Leu. 

1455 1460 1465 



7142 



eag cct gte aae eat get tae eag ace aag aae agt ctt eag eee 
Gin Pro Val Ash His Ala Tyr Gin Thr Lys Asn Ser Leu Gin Pro 
1470 1475 1480 



7167 



cga cga tac eag. get 
10 Arg Arg Tyr Gin Ala 

1485 



cac ttg gtt gga acg act tac gtc t 
His Leu Val Gly Thr Thr Tyr Val 
1490 



7227 



getagfceaca tttcacgctc tggttttctg accgceactg gttattgacg ttctgcttgg 7287 



15 cgtcacag ac gac tac ccc gat etc ttc gtt eag agt ttg cgc aag 
Tyr Asp Tyr Pro Asp Leu Phe Val Gin Ser Leu Arg Lys 
. 1495 1500 1505 



7333 



gtt tgg get gag get get get aag att 
20 Val Trp Ala Glu Ala Ala Ala Lys lie 
1510 1515 



cct cac etc egg gtg cct 
Pro His Leu Arg Val Pro 
1520 



7378 



23 



age gag cct ctt acc get acc gag ttg gtt etc gat gag aac aac 
Ser Glu Pro Leu Thr Ala Thr Glu Leu Val Leu Asp Glu Agn Asn 
1525 ,1530 1535 



7423 



30 



gag ett cag gag gtc gag cga ect eeg ggt tee aac teg tgt ggt 
Glu Leu Gin Glu Val Glu Arg Pro Pro Gly Ser Asn Ser Cys Gly 
.1540 1545 1550 



7468 



35 



atg gtc gee tgg ate ttc act atg etc act ccc gag tat ccc aag 7513 
Met Val Ala Trp lie Phe Thr Met Leu Thr Pro Glu Tyr Pro Lys 
1555 1560 1565 

ggt cga cga gta gtt gec att gee aac gat ate acc ttc aag att 7558 
Gly Arg Arg Val val Ala He Ala Asn Asp He Thr Phe Lys lie 
1570 1575 1580 



40 gga tec ttt ggt cct aag gaa gac gat tac ttc ttc aag get act 
Gly Ser Phe Gly Pro Lys Glu Asp Asp Tyr Phe Phe Lys Ala Thr 
1585 1S90 1595 



7603

gaa att gcc aag aag ctg ggc ctt cct cga att tac ctc tct gcc  7648
Glu Ile Ala Lys Lys Leu Gly Leu Pro Arg Ile Tyr Leu Ser Ala
            1600                1605                1610

aac agt gga gct aga ctc ggt atc gcg gag gag ctc ttg cac atc  7693
Asn Ser Gly Ala Arg Leu Gly Ile Ala Glu Glu Leu Leu His Ile
            1615                1620                1625

ttc aag gcg gcc ttc gtt gac ccc gca aag cct tcc atg ggt att  7738
Phe Lys Ala Ala Phe Val Asp Pro Ala Lys Pro Ser Met Gly Ile
            1630                1635                1640

aag tat cta tac ttg acc cct gaa act tta tcc act ctt gcc aag  7783
Lys Tyr Leu Tyr Leu Thr Pro Glu Thr Leu Ser Thr Leu Ala Lys
            1645                1650                1655

aag gga tcc agc gtc acc act gag gag atc gag gat gac ggc gag  7828
Lys Gly Ser Ser Val Thr Thr Glu Glu Ile Glu Asp Asp Gly Glu
            1660                1665                1670

cga cga cac aag atc acc gcc atc atc ggt ctt gca gag ggt ttg  7873
Arg Arg His Lys Ile Thr Ala Ile Ile Gly Leu Ala Glu Gly Leu
            1675                1680                1685

gga gtt gag tct ctt cga gga tcc ggt ctt att gct gga gcc acc  7918
Gly Val Glu Ser Leu Arg Gly Ser Gly Leu Ile Ala Gly Ala Thr
            1690                1695                1700

act cga gct tac gag gag gga atc ttc acc atc tct ctc gtt act  7963
Thr Arg Ala Tyr Glu Glu Gly Ile Phe Thr Ile Ser Leu Val Thr
            1705                1710                1715

gcc cga tcg gtc ggt atc gga gct tac ttg gtt cga ttg ggt cag  8008
Ala Arg Ser Val Gly Ile Gly Ala Tyr Leu Val Arg Leu Gly Gln
            1720                1725                1730

cga gct att cag gtt gaa ggc aac cct atg atc ctt act gga gct  8053
Arg Ala Ile Gln Val Glu Gly Asn Pro Met Ile Leu Thr Gly Ala
            1735                1740                1745

cag tct ctc aac aag gtg ctt gga cga gag gtt tac act tcc aac  8098
Gln Ser Leu Asn Lys Val Leu Gly Arg Glu Val Tyr Thr Ser Asn
            1750                1755                1760




cct cag ctt gga gga acc cag att atg gcc cga aac ggt acc acg 
Leu Gin Leu Gly Gly Thr Gin He Met Ala Arg Asn Gly Thr Thr 
1765 1770 1775 



8143 



5 cat etc gtc get gaa tct gat etc gat ggt get etc aag gtc ate 
His Leu Val Ala Glu Ser Asp Leu Asp Gly Ala Leu Lys Val lie 
1780 1785 1700 



6188 



cag tgg etc teg tat gtg ccc gag cga aag ggc aag gcc att cct 
10 Gin Trp Leu Ser Tyr Val Pro Glu Arg Lys Gly Lys Ala lie Pro 
1795 1800 180S 



8233 



1$ 



ate egg cct ccc gag gac cct tgg gac cga act gtg acc tac gag 
He Trp Pro Ser Glu Asp Pro Trp Asp Arg Thr Val Thr Tyr Glu 
1810 1815 1820 



8278 



20 



cct ccc cga ggt 
Pro Pro Arg Gly 
1825 

ccg gat gaa ggc 
Pro Asp Glu Gly 
1840 



cct tac gat cct cga tgg ttg ctt gaa gga aag 
Pro Tyr Asp Pro Arg Trp Leu Leu Glu Gly Lys 
1830 1835 



ttg act ggt ctt ttc 
Leu Thr Gly Leu Phe 
1845 



gac aag gga tct ttc atg 
Asp Lys Gly Ser Phe Met 
1850 



8323 



8368 



25 gag acc ctt gga gat tgg gcc aag act ate gtc acc ggt cga gcc 
Glu Thr Leu Gly Asp Trp Ala Lys Thr He Val Thr Gly Arg Ala 
1855 1860 1865 



8413 






cga ctg gga ggc 
Arg Leu Gly Gly 
1870 
acg acc gag aag 
Thr Thr Glu Lys 
. .-1885 

ttc gag caa aag 
Phe Glu Gin Lys 
1900 



att cct atg ggt gtt 
He Pro Met Gly Val 
1875 

ate ate get gcc gat 



att get gtc gaa acc 
lie Ala Val Glu Thr 
1880 

cct gcc aac cct gca 



He He Ala Ala Asp pro Ala Asn Pro Ala 
1890 . 1895 



agg 

Arg 

get 
Ala 



att atg gag get ggt cag gtt tgg aac ccc aac 
He Met Glu Ala Gly Gin Val Trp Asn Pro Asn 
1905 1910 



8458 



8503 



8548 



40 



get get tac aag acc get caa tec ate ttt gat ate aac aag gag 
Ala Ala Tyic Lys Thr Ala Gin Ser lie Phe Asp He Asn Lys Glu 
1^15 1920 1925 



8593 
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ggt ctt cct ttg atg ate ctt gec aac ate cga ggt etc tct gga 
Gly Leu Pro Leu Met Xle Leu Ala Asn He Arg Gly Phe Ser Gly 
1930 1935 1940 



8636 



gga cag ggt gat atg ttt gac get ata etc aag cag ggt tefc aag 
Gly Gin Gly Asp Met Phe Asp Ala He Leu Lys Gin Gly Ser Lys 
1945 1950 1955 



8683 



ate gtt gac ggt etc teg aac ttc aag 
10 He val Asp Gly Leu Ser Asn Phe Lys 
1960 1965 



cag eea gtg etc gtc tat 
Gin Pro Val Phe Val Tyr 
1970 



8728 



15 



gtt gtc ccc aac gga gag ctt cgt gga gga get tgg gtc gtg ttg 
Val Val Pro Asn Gly Glu Leu Arg Gly Gly Ala Trp Val Val Leu 
1975 1980 1985 



8773 



20 



gat cct act ate 
Asp Pro Thr Xle 
1990 

acc get cga gga 
Thr Ala Arg Gly 
2005 



aac ctt gec aag atg gag atg tac get gat gaa 
Asn Leu Ala Lys Met Glu Met Tyr Ala. Asp Glu 
1995 2000 

gga att etc gag ceg gaa ggt ate gtt gag ate 
Gly He Leu Glu Pro Glu Gly He Val Glu He 
2010 2015 



8818 



8863 



25 aag ttc cga cga gac aag gtc ate get acc atg gag cga ttg gac 
Lys Phe Arg Arg Asp Lys Val He Ala Thr Met Glu Arg Leu Asp 
2020 2025 2030 



8908 






gag acc tat gec 
Glu Thr Tyr Ala 

2035 
tct gcg gag gag 
Ser Ala Glu Glu 

2050 

gag act eta ctt 
Glu Thr Leu Leu 
2065 



tct etc aaa get gec teg aac gac tea acc aag 
Ser Leu Lys Ala Ala Ser Asn Asp Ser Thr Lys 

2040 2045 
cga get aag agt get gag eta etc aag gca aga 
Arg Ala Lys Ser Ala Glu Leu Leu Lys Ala Arg 

2055 2060 

caa ecg aeg tac ttg cag att gca cac ctt tac 
Gin Pro Thr l*yr Leu Gin He Ala His Leu Tyr 
2070 2075 



8953 



8998 



9043 



40 get gat etc cat gat cgt gtc gga cga atg gag gee aag ggt tgc 
Ala Asp Leu His Asp Arg Val Gly Arg Met Glu Ala Lys Gly Cys 
2080 2085 2090 



9088

gcg aag cga gct gtc tgg gct gag gct cga cga ttc ttc tac tgg  9133
Ala Lys Arg Ala Val Trp Ala Glu Ala Arg Arg Phe Phe Tyr Trp
            2095                2100                2105

cga ctt cga cga cgt ctc aac gat gag gtgagccgtc ccattcactc  9180
Arg Leu Arg Arg Arg Leu Asn Asp Glu
            2110                2115

tttcgttgca aggttcagta gtactaaccg cttctttctt tatctatcag cac atc  9236
                                                       His Ile

ctg tct aag ttc gct gct gcc aac ccg gat ctt act ctc gag gag  9281
Leu Ser Lys Phe Ala Ala Ala Asn Pro Asp Leu Thr Leu Glu Glu
        2120                2125                2130

cga caa aac att ctc gac tct gtc gtc cag act gac ctc act gat  9326
Arg Gln Asn Ile Leu Asp Ser Val Val Gln Thr Asp Leu Thr Asp
        2135                2140                2145

gac cga gcc acc gct gaa tgg att gag cag tct gca gaa gag att  9371
Asp Arg Ala Thr Ala Glu Trp Ile Glu Gln Ser Ala Glu Glu Ile
        2150                2155                2160

gct gct gcc gtt gcc gaa gtc cga tcc acc tac gtg tcg aat aag  9416
Ala Ala Ala Val Ala Glu Val Arg Ser Thr Tyr Val Ser Asn Lys
        2165                2170                2175

att atc agc ttc gcc gag acg gag cga gct gga gcg ttg cag ggc  9461
Ile Ile Ser Phe Ala Glu Thr Glu Arg Ala Gly Ala Leu Gln Gly
        2180                2185                2190

ttg gtc gct gtc ttg agc act ttg aat gcg gaa gac aag aag gcc  9506
Leu Val Ala Val Leu Ser Thr Leu Asn Ala Glu Asp Lys Lys Ala
        2195                2200                2205

ctt gtt tct agc ctt ggt ctc taa attttaactt tttttgtcga tgctattctt  9560
Leu Val Ser Ser Leu Gly Leu
        2210

cctatcttta gtctttgatc aacttttgaa tatccttcat agatctttcc ttgeatacat 9620 



tgatattatt tcctcacccg tttttatgta cttccatacg agtttccatt tttttctget 9680 



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tctataetttc gactacacgt cgactgttca cctgccCctc ttttgttctt tctgttctgt 9740 

tttcttctgt tctttcgect ettgggattc tataetetce tecgeate.ta cacatgetea SB00 

tgetaacgtc egaetcagag ctcaceagga tatgtcgtga gagcccgaaa caagttgcac 9860 

aacatatatt gataatgatc agaacactct aagaccaccc agfcccatgat cagccgcatc .9920 

10 gccagtttcg atctcttctc cattctcetc aacctcaatc tcctcccgga tcgtc.ctgcc 9980 

eagcagactg ccgaataact cgccgacctg ctccccctgc cacaagtcet ccgttcgctc 10040 

aggaaecatg aagtteatga tctttttcttg gggggtatat cgaagcttgc gacetttaga 10100 

15 

agctcgtgta tcgagggtgg gcctgcgctt tttgggtccg taattggaaa aggttgcttg 10160 

1 

gcetattttca aaataaacga aatcgacgat catacaccgc cgtagaccgt ttctggtcag 10220 

20 gattttgtgt tggacgatga tatacegate gatgtttgag eagaeaaggg agtcaggaag 10280 

agaetactta eeacfceatag cgcegacccc agcaccfccca cctcttcgct cgattgacgtc 10340 

cetgaccaag ctccggtaaa aetettfcgtc atcaccccaa acggcggcct cacattcagc 10400 

25 

ctcatcctga gagacgagtc ccatgaaccg atetaefcttt tteetaccec ctagacccte 10460 

aagggaagct ccaatttgct cgacgactcc gatcttgacg gatttaaacfc tttcaecteg 10520 

30 aagattctga aggcectgag eggteataat cttggaagac c 10561 



<210> 2
<211> 6645
<212> DNA
<213> Phaffia rhodozyma

<220>
<221> CDS
<222> (1)..(6645)
<223>

<400> 2

atg gtt gtc gat cac gag agc gta agg cat ttc atc ggt gga aac gca
Met Val Val Asp His Glu Ser Val Arg His Phe Ile Gly Gly Asn Ala
1               5                   10                  15






ate gtt gac att gec gag cga ttc aat ata cat get gtt tgg get gga 
lie Val Asp lie Ala Qlu Arg Phe Asn lie His Ala Val Trp Ala Gly 
115 120 125 






48 



ctt gag aac gec cct ccg tea age gtc acc gat ttc gtt aga agt caa $6 
Leu Glu Asn Ala Pro Pro Ser Ser Val Thr Asp Ph« Val Arg s«r Gin 
20 25 ao 

gat ggt eae aeg gtc ate ace aaa gtc etc att gee aac aac gga ate 144 
Asp Gly His Thr Val He Thr Lys Val Leu He Ala Asn Asn Gly He 
35 40 45 

get get gta aaa gag ate cga tea gtt cgt aaa tgg get tac gag acg 192 
Ala Ala Val Lys Glu He Arg Ser Val Arg Lys Trp Ala Tyr Glu Thr 
50 55 60 



ttt gga gat gag cga gec ate gaa ttt acg gta atg gee act cca gaa 240 
Phe Gly Asp Glu Arg Ala He Glu phe Thr val net Ala Thr pro Glu 
20 65 70 75 80 

gat ttg aag gtg aac tgc gac tat att cga atg get gat cga gte gtc 288 
Asp t,eu Lys val Asn Cys Asp Tyr He Arg Met Ala Asp Arg Val Val 
85 30 g5 



gaa gtt cct gga gga act aac aac aac aat cac tct aac gtc gac etc 336 
Glu Val Pro Gly Gly Thr Asn Asn Asn Asn His Ser Asn Val Asp Leu 
100 105 no 



384 



tgg ggt cac gct tcg gaa aac ccc aga ctt ccc gag tct ctc gcc gcc  432
Trp Gly His Ala Ser Glu Asn Pro Arg Leu Pro Glu Ser Leu Ala Ala
    130                 135                 140

tca aag aac aag atc gtc ttc att ggt cct ccc gga tcc gct atg cga  480
Ser Lys Asn Lys Ile Val Phe Ile Gly Pro Pro Gly Ser Ala Met Arg
145                 150                 155                 160

tcc ctt gga gac aag att tct tcg acc atc gtt gcc cag tct gcc cag  528
Ser Leu Gly Asp Lys Ile Ser Ser Thr Ile Val Ala Gln Ser Ala Gln
                165                 170                 175

gtg ccg tgt atg gcc tgg tct gga tca ggc atc act gat aca gag ctc  576
Val Pro Cys Met Ala Trp Ser Gly Ser Gly Ile Thr Asp Thr Glu Leu
            180                 185                 190

agc cct cag ggc ttc gtg act gtg ccc gat ggg cca tat cag gct gct  624
Ser Pro Gln Gly Phe Val Thr Val Pro Asp Gly Pro Tyr Gln Ala Ala
        195                 200                 205

tgt gta aag acg gtg gag gat ggt ttg gtg cga gcc gag aag atc ggt  672
Cys Val Lys Thr Val Glu Asp Gly Leu Val Arg Ala Glu Lys Ile Gly
    210                 215                 220

ttg cca gtt atg atc aag gcc tct gag gga gga gga gga aag ggt atc  720
Leu Pro Val Met Ile Lys Ala Ser Glu Gly Gly Gly Gly Lys Gly Ile
225                 230                 235                 240

cga atg gtt cac agc atg gac aca ttc aag aac tcc tac aac tcc gtc  768
Arg Met Val His Ser Met Asp Thr Phe Lys Asn Ser Tyr Asn Ser Val
                245                 250                 255

gct tcc gag gtg cca gga tct ccg att ttc atc atg gcc ttg gct gga  816
Ala Ser Glu Val Pro Gly Ser Pro Ile Phe Ile Met Ala Leu Ala Gly
            260                 265                 270

tct gct cga cat ttg gag gtc cag ctc ctt gct gat cag tac gga aac  864
Ser Ala Arg His Leu Glu Val Gln Leu Leu Ala Asp Gln Tyr Gly Asn
        275                 280                 285

gct atc tct ttg ttc ggt cga gat tgc tct gtt cag cga cga cat cag  912
Ala Ile Ser Leu Phe Gly Arg Asp Cys Ser Val Gln Arg Arg His Gln
    290                 295                 300

aag atc att gag gag gct ccc gtc acg atc gct cgt cca gag aga ttc  960
Lys Ile Ile Glu Glu Ala Pro Val Thr Ile Ala Arg Pro Glu Arg Phe
305                 310                 315                 320

gaa gag atg gag aag gct gct gtc agg ttg gcc aag tta gta gga tat  1008
Glu Glu Met Glu Lys Ala Ala Val Arg Leu Ala Lys Leu Val Gly Tyr
                325                 330                 335

gtt agt gcc ggt acc gtc gaa tac ctc tac tct cac gcc gac gac tca  1056
Val Ser Ala Gly Thr Val Glu Tyr Leu Tyr Ser His Ala Asp Asp Ser
            340                 345                 350
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10 






ttc tec tec ecu gaa etc aac cct cga ctt caa gtc gag cae cet act . iig« 
Phe Plie Phe Leu Glu Leu Asn Pro Arg Leu Gin Val Glu His Pro Thr 
355 360 365 

acc gag atg gtc teg ggt gtc aac ctt ccc get get cag ctt cag att 1152 
Thr Glu Met val Ser Gly Val Asn Leu Pro Ala Ala Gin Leu Gin He 
370 375 380 

get atg ggt ate cct ctt tct cga att egg gat att cga gtc etc tac 1200 
Ala Met Gly lie Pro Leu Ser Arg He Arg Asp lie Arg Val Leu Tyr 
3flS 390 39S 400 

ggt etc gat ccc cac act gtt tec gag ate gac ttc gac age age aga 1248 
Gly Leu Asp. Pro His Thr val Ser Glu He Asp Phe Asp Ser Ser Arg 
405 410 415 



gcg gag tct gtc. cag act cag agg aag cct agg ccc aag ggt cac gtc 
Ala Glu Ser Val Gin Thr Gin Arg Lys Pro Arg Pro Lys Gly His Val 
20 420 425 430 



tgg gga tac ttc tct gtt gga get act gga gga att cat agt ttc gec 
Trp Gly Tyr Phe Ser val Gly Ala Thr Gly Gly He His Ser Phe Ala 
465 " 470 475 480 

gat cct caa ttc ggt cac gtg ttt get tat ggc tec gac cga acg act 
Asp Ser Gin Phe Gly His Val Phe Ala Tyr Gly Ser Asp Arg Thr Thr 
33 485 490 495 



1296 



att gec tgt cga ate acg agt gaa aac ccc gat gag ggg ttc aag ccg 1344 
He Ala Cys Arg He Thr Ser Glu Asn Pro Asp Glu Gly Phe Lys Pro 
435 440 445 

tct gec gga gat ate caa gag ttg aac ttc aga agt aat act aac gtc 1392 
Ser Ala Gly Asp He Gin Glu Leu Asn Phe Arg Ser Asn Thr Asn Val 
450 455 460 



1440 



1488 



gec aga aag aat acg gtt ate gec ttg aaa gag ctt tec att cga gga 1536 
Ala Arg Lys Asn Met Val He Ala Leu Lys Glu Leu Ser He Arg Gly 
500 505 510 

gae ttc cga acc act gtc gag tat ctt ate act ctt cct gag acg age 1584 
Asp Phe Arg Thr Thr Val Glu Tyr Leu zle Thr Leu Leu Glu Thr Ser 
515 520 525 






gat ttc gag* cag aac gee att ace aec gee tgg teg gat ggg ttg ate 1632 
Asp Phe Glu Gin Asn Ala lie Thr Thr Ala Trp Leu Asp Gly Leu lie 
530 S35 540 

act aac aag ctt aca tet gag agg cct gat cca tea ctg gee gtt afct ,1680 
Thr Asn Lys Leu Thr Ser Glu Arg Pro Asp Pro Ser Leu Ala val He 
545 550 5S5 560 

tgt ggt gca att gtg aaa get cac gtg get tefc gag aae tgt tgg gec 1728 
Cys Gly Ala lie Val Lys Ala His Val Ala Ser Glu Asn Cys Trp Ala 
565 570 575 

gaa tac cga cga gta ttg gac aag gga cag gtt ccc tec aag gac act 1776 
Glu Tyr Arg Arg val Leu Asp Lys Gly Gin Val Pro Ser Lys Asp Thr 
5B0 585 590 

etc aag aca gtg ttc act ctt gat ttc ate tat gag ggt gtt egg tac 1824 
Leu Lys Thr Val Phe Thr Leu Asp Phe lie Tyr Glu Gly val Arg Tyr 
535 600 605 



aat ttc ace get get cga gee tec etc aac act tac cga ttg tat eta 
Asn Phe Thr Ala Ala Arg Ala Ser Leu Asn Thr Tyr Arg Leu Tyr Leu 
610 615 620 



1872 



aac gga gga aag acc gtg gtg tec ate cga cct ttg gec gat ggt gga 1920 
Asn Gly Gly Lys Thr val Val Ser lie Arg Pro Leu Ala Asp Gly Gly 
625 630 635 640 

atg etc gtt ctt etc gat ggc cga tec cac act etc tac tgg agg gag 1968 
Met Leu Val Leu Leu Asp Gly Arg Ser His Thr Leu Tyr Trp Arg Glu 

645 650 655 

gaa gtc ggt acc etc cga act cag gta gac gca aag act tgc ctg att 2016 
Glu Val Gly Thr Leu Arg lie Gin Val Asp Ala Lys Thr Cys Leu lie 
660 665 670 

gag cag gag aac gac ccc act cag etc cga tea ccc teg cct gga aag 2064 
Glu Gin Glu Asn Asp Pro Thr Gin Leu Arg Ser Pro Ser Pro Gly Lys 
675 680 685 

ate ate egg ttt ttg gtc gaa age gga gat cac ate tec tec gga gat 2112 
Xle Tie Arg Phe Leu Val Glu Ser Gly Asp His lie Ser Ser Gly Asp 
690 69S 700 
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ate tat get gag gtt gag gtc atg aag atg ate teg ccc ttg att gec 2160 
. Ho. Tyr Ala Glu Val Glu Val Met Lys Met lie Leu Pro Leu lie Ala 
70S 710 715 720 

cag gag tec ggt cac gtt cag ttt gtc aag caa gec ggt gtg acc gtc 2208 
Gin Glu Ser Gly His Val Gin Phe Val Lya Gin Ala Gly Val Thar Val 
725 730 735 

gat cct gga gcgr afct att ggg ate ttg agt ctt gat gae ecc acg cga 2256 
Asp Pro Gly Ala , He lie Gly He Leu Ser Leu Asp Asp Pro Thr Arg 
740 745 750 

gtg aag aag geg aag ccc ttc gag ggt etc ctg ect gtg act ggt etc 2304 
Val Lys Lys Ala Lys Pro Phe Glu Gly Leu Leu Pro Val Thr Gly Leu 
755 760 765 

cct aac ctg ccc ggt aac aga cct cac cag egg eta cag ttc cag ctt 2352 
Pro Asn Leu Pro Gly Asn Arg Pro His Gin Arg Leu Gin Phe Gin Leu 
770 775 780 

gag teg ata tac teg gtc ttg gat gga tac gag agt gac tec act gca 2400 
Glu Ser He Tyr Ser Val Leu Asp Gly Tyr Glu Ser Asp Ser Thx Ala 
785 790 795 800 

aca ate etc cga tea ttc tct gaa aac ctt tat gat cct gat ctt get 2448 
Thr He Leu Arg Ser Phe Ser Glu -Asn Leu -Tyr Asp Pro Asp Leu Ala 
805 810 815 

ttc gga gag get tta tec ate att tec gtc ctt tct ggg aga atg cct 2496 
Phe Gly Glu Ala Leu Ser He lie Ser Val Leu Ser Gly Arg Met Pro 

820 825 830 

gee gat ctt gag gag age att cga gag gtc ate age gaa get cag teg 2544 
Ala Asp Leu Glu Glu Ser He Arg Glu Val He Ser Glu Ala Gin Ser 
835 840 845 

aag cct cac gec gag ttc cct gga tea aag ate etc aaa gtc gtc gag 2592 
Lys Pro His Ala Glu Phe Pro Gly Ser Lys He Leu Lys Val Val Glu 
850 855 860 

egg tac ate gat aat ttg cga cct cag gag agg get atg gtc cga act 2640 
Arg Tyr He Asp Asn Leu Arg Pro Gin Glu Arg Ala Met Val Arg Thr 
865 870 875 880 



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cag ate gaa ccc ate gtt ggt att get gag aag aac gtt ggc ggt cct 
Gin He Glu Pro lie Val Gly He Ala Glu Lys Asn Val Gly Gly Pro 
885 890 895 



2686 



aag ggt tac gec tct tac gtc tta get ace ate ctt caa .aag ttc ttg 
Lys Gly Tyr Ala Ser Tyr val Leu Ala Thr He Leu Gin Lys Phe Leu 
900 905 910 



2736 



gec gtt gag gec gtt ttt get act ggt agt gaa gag gec att gtt etc 
10 Ala Val Glu Ala Val Phe Ala Thr Gly Ser Glu Glu Ala He Val Leu 
915 920 925 



2784 



15 



caa ctt cga gat gaa aac cga gaa tct ttg aac gac gtc ctt ggt etc 
Gin Leu Arg Asp Glu Asn Arg Glu Ser Leu Asn Asp Val Leu Gly Leu 
930 935 940 



2832 



20 



gtc ctg get cac teg cgt etc age get cga tec aag ctt gtt etc tec 2880 
val Leu Ala His Ser Arg Leu Ser Ala Arg Ser Lys Leu Val Leu Ser 
945 950 955 960 

gtc ttt gat ctg ate aag tct atg cag etc etc aac aac act gag ggt 2928 
Val Phe Asp Leu He Lys Ser Met Gin Leu Leu Asn Asn Thr Glu Gly 
965 970 975 



25 tct ttc ctt cat aag act atg aaa gcg ctt gee gac atg ccc ace aag 
Ser Phe Leu His Lys Thr Met Lys Ala Leu Ala Asp Met Pro Thr Lys 
980 985 990 



2976 






get cct ttg gec age aag gtg tct ttg aag get egg gaa att ctt ate 
Ala Pro Leu Ala Ser Lys val Ser Leu Lys Ala Arg Glu He Leu lie 
995 1000 1005 



tct tgc tct ctt ccc tct tac 
Ser Cys Ser Leu Pro Ser Tyr 
1010 1015 

aag ate ctt aac tct tct gtc 
Lys He Leu Asn Ser Ser Val 
1025 1030 



gag gag agg ttg ttc 
Glu Glu Arg Leu Phe 
1020 

acc act tct tac tac 
Thr Thr Ser Tyr Tyr 
1035 



cag atg gaa 
Gin Met Glu 



gga gag act 
Gly Glu Thr 



3024 



3069 



3114 



40 gga ggt gga cac aga aac cct teg gtt gat gtt ctg act gag ate 
Gly Gly Gly His Arg Asn Pro Ser Val Asp Val Leu Thr Glu He 
1040 1045 1050 



3159 
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tea aac tct cga ttc acc gtc tac gat gtc ctg tec tec ttc ttc 3204 
Ser Asn Ser Arg Phe Thr Val Tyr Asp Val Leu Ser Ser Phe Phe 
10SS 1060 1065 

5 aag cac gat gat cct tgg att gtt ctt get agt ttg ace gtc tae 3249 
Lys His Asp Asp Pro Trp lie Val Leu Ala Ser Leu Thr Val Tyr 

1070 1075 ioao 



gtt ctt cga get tac cga gag tac agt att ctt gat atg caa cat 
10 Val Leu Arg Ala Tyr Arg Glu Tyr Ser lie Leu Asp Met Gin His 
1085 1030 1095 



3294 



15 



gag caa ggtf cag gaf ggc get* get "gga gtc ate act tgg cga ttc 
Glu Gin Gly Gin Asp Gly Ala Ala Gly Val He Thr Trp Arg Phe 
1100 1105 - mo 



3339 



20 



aag etc aac cag ccc ate get gag tct tct act ccc cga gtt gac 
Lys Leu Asn Gin Pro lie Ala Glu Ser Ser Thr Pro Arg Val Asp 
1115 1120 H25 

teg aat cga gac gtt tac cga gtc ggt teg ctt tct gat ttg acc 
Ser Asn Arg Asp Val Tyr Arg Val Gly Ser Leu Ser Asp Leu Thr 
1130 1135 H40 



3384 



3429 



*5 tac aag ate aag cag agt cag acc gag ccc etc cga get ggt gtc 
Tyr Lys Xle Lys Gin Ser Gin Thr Glu Pro Leu Arg Ala Gly Val 
1145 1150 1155 



3474 



atg acg age ttc aac aac ttg aag gag gtt cag gac gga etc ttg 
I Met Thr Ser Phe Asn Asn Leu Lys Glu Val Gin Asp Gly Leu Leu 
1160 1165 1170 

aat gtt ctg tct ttc ttc cct get tac cat cat caa gat ttc act 
Asn Val Leu Ser Phe Phe pro Ala Tyr His His Gin Asp Phe Thr 
H75 1180 1185 



3519 



3564 



caa cga cat ggt cag gac agt gee atg ccc aac gtt etc aac att 
Gin Arg His Gly Gin Asp ser Ala Met Pro Asn val Leu Asn He 
1190 1135 1200 



3609 



get ate egg get ttc gag gag aag gac gac atg tct gat ctt gat 
Ala He Arg Ala Phe Glu Glu Lys Asp Asp Met Ser Asp Leu Asp 
1205 1210 1215 



3654 
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cgg gcc aag agt gtt gag teg ctg gta atg cag atg- tct gec gag 
Tip Ala Lys Ser Val Glu Ser Leu val Met Gin Met Ser Ala Glu 
1220 1225 1230 



ate cag aag aag gga att cga cga gtt acc ttc tfcg gtt tgc cga 
lie Gin Lys Lys Gly He Arg Arg Val tfhr Phe Leu Val Cys Arg 
1235 1240 1245 



3744 



aag ggc gtt tac ccc tec tac ttc acc ttc aga caa gag ggt gcc 
10 Lys Gly Val Tyr Pro Ser Tyr Phe Thr Phe Arg Gin Glu Gly Ala 
1250 1255 1260 



3789 



15 



cag ggc ccc tgg aga gag gag gag aag att cga aac acc gag cct 
Gin Gly Pro Trp Arg Glu Glu Glu Lys He Arg Asn He Glu Pro 
1265 1270 1275 



3634 



20 



get eta gcc agt cag ctt gag etc aac cga etc teg 

Ala Leu Ala Ser Gin Leu Glu Leu Asn Arg Leu Ser 
12d0 1265 1290 

gtc acc cct ate ttc gta gac aac aga cag ate cac 

Val Thr Pro lie Phe val Asp Asn Arg Gin He His . 
1295 1300 1305 



aat ttc aag 
Asn Phe Lys 



ate tac aag 
He Tyr Lys 



3879 



3924 



25 gga gtg ggt aag gag aac tct tec gat gtt cga ttc ttt ate egg 
Gly Val Gly Lys Glu Asn Ser Ser Asp Val Arg Phe Phe He Arg 
1310 1315 132 0 



3969 






get ttg gtt cga cct gga egg 

Ala Leu Val Arg Pro Gly Arg 

1325 1330 

gag tat etc ate tec gag tgc 

Glu Tyr Leu lie Ser Glu Cys 

1340 1345 

gac gcc ttg gag gtt gtt gga 

Asp Ala Leu Glu Val Val Gly 
1355 1360 



gtc cag gga teg atg 
Val Gin Gly Ser Met 
1335 

gat cga ctg etc act 
Asp Arg Leu Leu Thr 
1350 



aag get gcc 
Lys Ala Ala 

gat ate ctg 
Asp He Leu 



gcc gag act cga aac gcc gat tgc 
Ala Glu Thr Arg Asn Ala Asp Cys 
1365 



4014 



4059 



4104 



40 



aac cat 
Asn His 
1370 



gtt gga att aac etc ate tat aac gtt ctt gtc gac ttc 
Val Gly He Asn Phe He Tyr Asn val Leu Val Asp Phe 
1375 1380 



4149 
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gac gac gte cag gag gcc ctt gcc ggg tec att gag agg cac gga 

Asp Asp Val Gin Glu Ala Leu Ala Gly Phe lie Glu Arg His Gly 
1385 * 1390 1395 



4194 



aag agg ctt tgg cga ctt cga gtg acc get tet gaa ate cga atg 
Lys Arg Leu Trp Arg Leu Arg Val Thr Ala Ser Glu He Arg Men 
1400 1405 1410 



4239 



gtt ctt 
10 Val Leu 
1415 



gag gac gac gag ggt aac gtc acc ccc ace cga tgc tgc 
Glu Asp Asp Glu Gly Asn Val Thr Pro He Arg Cys Cys 
1420 142S 



4264 



13 



ate gag aac gtt tct ggt ttc gtc gtg aag tac cac gcc tac cag 
He Glu Asn val Ser Gly Phe val Val Lys Tyr His Ala Tyr Gin 
1430 1435 1440 



4323 



20 



gag gtt 
Glu yal 
1445 

gac ctt 
Asp Leu. 
1460 



gag acc gag aag ggt act acc ate ttg aag tea ate gga 
Glu Thr Glu Lys Gly Thr Thr He Leu Lys Ser He Gly 
1450 1455 

gga cct ctt cac ctt cag cct gtc aac cat get tac cag 
Gly Pro Leu His Leu Gin Pro Val Asn His Ala Tyr Gin 
1465 1470 



4374 



4419 



25 acc aag aac agt ctt cag ccc cga cga tac cag get cac ttg gtt 
Thr Lys Asn Ser Leu Gin Pro Arg Arg Tyr Gin Ala His Leu Val 
1475 1480 1485 



4464 






gga acg 
Gly Thr 

1490 
ttg egc 
Leu Arg 

1505 

egg gtg 
Arg yal 
1520 



act tac gtc tac gac tac ccc gat etc ttc 
Thr Tyr Val Tyr Asp Tyr Pro Asp Leu Phe 

1495 1500 
aag gtt tgg get gag get get get aag att 
Lys Val Trp Ala Glu Ala Ala Ala Lys He 

1510 1S15 



gtt cag agt 
Val Gin Ser 

cct cac etc 
Pro His Leu 



cct age gag cct ctt acc get ace gag ttg gtt etc gat 
Pro Ser Glu Pro Leu Thr Ala Thr Glu Leu Val Leu Asp 
1525 1530 



4509 



4554 



4599 



40 



gag aac aac gag ctt cag gag gtc gag cga cct ccg ggt tee aac 
Glu Asn Asn Glu Leu Gin Glu Val Glu Arg Pro Pro Gly Ser Asn 
1535 1540 1545 



4644 



35 
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teg tgt ggt afcg gtc gcc egg ate ttc ace atg etc act ccc gag 4689 
Ser Cys Gly Met Val Ala Trp lie Phe Thr Met Leu Thr Pro Glu 
1550 1555 1560 

5 tat ccc aag ggt cga cga gta gtt gcc att gcc aac gat ate acc 4734 
Tyr Pro Lys Gly Arg Arg Val Val Ala lie Ala Asn Asp lie Thr 
1565 1570 1575 

ttc aag att gga tec ttt ggt cct aag gaa gac gat tac ttc ttc 4779 
10 Phe Lys lie Gly Ser Phe Gly Pro Lys Glu Asp Asp Tyr Phe Phe 
1580 1585 1590 



4824 



aag get act gaa att gcc aag aag etg ggc ctt cct cga att tac 
Lys Ala Thr Glu He Ala Lys Lys Leu Gly Leu Pro Arg He Tyr 
15 1595 1600 1605 

etc tct gcc aac agt gga get aga etc ggt ate gcg gag gag cte 4869 

Leu Ser Ala Asn Ser Gly Ala Arg Leu Gly He Ala Glu Glu Leu 
1610 1615 1620 

20 

ttg cac ate ttc aag gcg gcc ttc gtt gac eec gca aag cct tec 4914 

Leu His He Phe Lys Ala Ala Phe Val Asp Pro Ala Lys Pro Ser 
1625 1630 1635 

25 atg ggt att aag tat eta tac ttg acc cct gaa act tta tec act 4959 
Met Gly He Lys Tyr Leu Tyr Leu Thr Pro Glu Thr Leu Ser Thr 
1640 1645 1650 

ctt gcc aag aag gga tec age gtc acc act gag gag ate gag gat 5004 
30 Leu Ala Lys Lys Gly Ser Ser Val Thr Thr Glu Glu He Glu Asp 

1655 1660 1665 

gac ggc gag cga cga cae aag ate acc gcc ate ate ggt ctt gca. 5049 
Asp Gly Glu Arg Arg His Lys He Thr Ala He He Gly Leu Ala 

1670 1675 1680 



gag ggt ttg gga gtt gag tct ctt cga gga tec ggt ctt att get 5094 
Glu Gly Leu Gly Val Glu Ser Leu Arg Gly Ser Gly Leu He Ala 
1685 1690 1695 



40 



gga gee acc act cga get tac gag gag gga ate ttc acc ate tct 
Gly Ala Thr Thr Arg Ala Tyr Glu Glu Gly He Phe Thr He Ser 
1700 1705 1710 



5139 
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cte gtt acc gec cga teg gtc ggt ate gga get tac teg get cga 
Leu Val Thr Ala Arg ser Val Gly He Gly Ala Tyr Leu Val Arg 
1715 1720 1725 



5184 



5 ttg ggt cag cga get att cag gtt gaa ggc aac cct atg ate ett 
Leu Gly Gin Arg Ala He Gin Val Glu Gly Asn Pro Met He Leu 
1730 1735 1740 



5229 



act gga get cag tct etc aac aag gtg cct: gga cga gag gtt tac 
10 Thr Gly Ala Gin Ser Leu Asn Lys Val Leu Gly Arg Glu val Tyr 
1745 1750 1755 



5274 



15 



act tec aac ctt cag ctt gga gga acc cag att atg gec cga aac 
Thr Ser Asn Leu Gin Leu Gly Gly Thr Gin He Met Ala Arg Asn 
1760 1765 1770 



5319 



20 



ggt acc acg cat etc gtc get gaa tct gat etc gat 
Gly Thr Thr His Leu val Ala Glu Ser Asp Leu Asp 
1775 1780 1785 

aag gtc ate cag tgg etc teg tat gtg ccc gag cga 
Lys Val Xle Gin Trp Leu Ser Tyr Val Pro Glu Arg 
1790 1795 1800 



ggt get etc 
Gly Ala Leu 



aag ggc aag 
Lys Gly Lys 



53 64 



5409 



25 gec att cct ate tgg cct tec gag gac cct tgg gac cga act gtg 
Ala He Pro He Trp Pro ser Glu Asp Pro Trp Asp Arg Thr Val 
1805 1810 1815 



5454 






acc tac gag cct ccc cga ggt 
Thr Tyr Glu Pro Pro Arg Gly 

1820 182S 
gaa gga aag ccg gat gaa ggc 
Glu Gly Lys Pro Asp Glu Gly 

1835 1640 

tct ttc atg gag acc ctt gga 
Ser Phe Met Glu Thr Leu Gly 
1B50 1855 



cct tac gat cct cga 
Pro Tyr Asp Pro Arg 
1830 

ttg act ggt ett ttc 
Leu Thr Gly Leu Phe 
1845 



tgg ttg ctt 
Trp Leu Leu 

gac aag gga 
Asp Lys Gly 



gat tgg gec aag act ate gtc acc 
Asp Trp Ala Lys Thr He val Thr 
I860 



5499 



5544 



5589 



40 



ggt cga gec cga ctg gga ggc 
Gly Arg Ala Arg Leu Gly Gly 
1865 1870 



att cct atg ggt gtt att get gtc 
He Pro Met Gly Val He Ala Val 
1875 



5634 



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gaa acc agg acg acc gag aag ate ate get gee gat cct gec aac 
Gin Thr Arg Thr Thr Glu Lys He Xle Ala Ala Asp pro Ala Asn 
1880 1885 1890 



5679 



cct gca get ttc gag caa aag att atg gag get ggt cag gtt tgg 
Pro Ala Ala Phe Glu Gin Lys He Met Glu Ala Gly Gin Val Trp 
1895 1900 1905 



5724 



aac ccc aac get get tac aag ace get caa tec ate ttt gat ate 
Asn Pro Asn Ala Ala Tyr Lys Thr Ala Gin Ser lie phe Asp He 
1910 1915 1920 



5769 



aac aag gag ggt ctt cct ttg atg ate ctt gec aac ate cga ggt 
Asn Lys Glu Gly Leu Pro Leu Met He Leu Ala Asn He Arg Gly 
1925 1930 1935 



5814 



ttc tot gga gga cag ggt gat atg ttt gac get ate etc aag cag 
Phe Ser Gly Gly Gin Gly Asp Met Phe Asp Ala He Leu Lys Gin 
1940 1345 1950 



5859 



ggt tct aag ate gtt gac ggt etc teg aac ttc aag cag cea gtg 
Gly Ser Lys He Val Asp Gly Leu Ser Asn Phe Lys Gin pro Val 
1955 1960 1965 



5904 



ttc gtc tat gtt gtc ccc aac gga gag ctt cgt gga gga get tgg 
Phe val Tyr Val Val Pro Asn Gly Glu Leu Arg Gly Gly Ala Trp 
1970 1975 1980 



5949 



gtc gtg ttg gat cct act atc aac ctt gcc aag atg gag atg tac 5994
Val Val Leu Asp Pro Thr Ile Asn Leu Ala Lys Met Glu Met Tyr
1985 1990 1995

gct gat gaa acc gct cga gga gga att ctc gag ccg gaa ggt atc 6039
Ala Asp Glu Thr Ala Arg Gly Gly Ile Leu Glu Pro Glu Gly Ile
2000 2005 2010

gtt gag atc aag ttc cga cga gac aag gtc atc gct acc atg gag 6084
Val Glu Ile Lys Phe Arg Arg Asp Lys Val Ile Ala Thr Met Glu
2015 2020 2025

cga ttg gac gag acc tat gcc tct ctc aaa gct gcc tcg aac gac 6129
Arg Leu Asp Glu Thr Tyr Ala Ser Leu Lys Ala Ala Ser Asn Asp
2030 2035 2040

tca acc aag tct gcg gag gag cga gct aag agt gct gag cta ctc 6174
Ser Thr Lys Ser Ala Glu Glu Arg Ala Lys Ser Ala Glu Leu Leu
2045 2050 2055

aag gca aga gag act cta ctt caa ccg acg tac ttg cag att gca 6219
Lys Ala Arg Glu Thr Leu Leu Gln Pro Thr Tyr Leu Gln Ile Ala
2060 2065 2070

cac ctt tac gct gat ctc cat gat cgt gtc gga cga atg gag gcc 6264
His Leu Tyr Ala Asp Leu His Asp Arg Val Gly Arg Met Glu Ala
2075 2080 2085

aag ggt tgc gcg aag cga gct gtc tgg gct gag gct cga cga ttc 6309
Lys Gly Cys Ala Lys Arg Ala Val Trp Ala Glu Ala Arg Arg Phe
2090 2095 2100



ttc tac tgg cga ctt cga cga cgt ctc aac gat gag cac atc ctg 6354
Phe Tyr Trp Arg Leu Arg Arg Arg Leu Asn Asp Glu His Ile Leu
2105 2110 2115

tct aag ttc gct gct gcc aac ccg gat ctt act ctc gag gag cga 6399
Ser Lys Phe Ala Ala Ala Asn Pro Asp Leu Thr Leu Glu Glu Arg
2120 2125 2130

caa aac att ctc gac tct gtc gtc cag act gac ctc act gat gac 6444
Gln Asn Ile Leu Asp Ser Val Val Gln Thr Asp Leu Thr Asp Asp
2135 2140 2145



cga gcc acc gct gaa tgg att gag cag tct gca gaa gag att gct 6489
Arg Ala Thr Ala Glu Trp Ile Glu Gln Ser Ala Glu Glu Ile Ala
2150 2155 2160

gct gcc gtt gcc gaa gtc cga tcc acc tac gtg tcg aat aag att 6534
Ala Ala Val Ala Glu Val Arg Ser Thr Tyr Val Ser Asn Lys Ile
2165 2170 2175

atc agc ttc gcc gag acg gag cga gct gga gcg ttg cag ggc ttg 6579
Ile Ser Phe Ala Glu Thr Glu Arg Ala Gly Ala Leu Gln Gly Leu
2180 2185 2190

gtc gct gtc ttg agc act ttg aat gcg gaa gac aag aag gcc ctt 6624
Val Ala Val Leu Ser Thr Leu Asn Ala Glu Asp Lys Lys Ala Leu
2195 2200 2205

gtt tct agc ctc ggt ctc taa 6645
Val Ser Ser Leu Gly Leu
2210
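
The codon rows and amino acid rows of the coding sequence above can be cross-checked mechanically. The following is a minimal sketch, assuming the standard genetic code applies to this coding sequence; the names `translate` and `row_end_position` are illustrative helpers and are not part of the listing. It translates a run of codons into three-letter residues and confirms that the nucleotide count at the end of a row advances by three per codon.

```python
# Standard genetic code, built in the conventional t/c/a/g codon order (assumed here).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))

# One-letter to three-letter residue names, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna):
    """Translate whitespace-separated codons, stopping at the first stop codon."""
    seq = "".join(dna.lower().split())
    residues = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon ends the open reading frame
            break
        residues.append(THREE_LETTER[aa])
    return residues


def row_end_position(previous_end, codons_in_row):
    """Nucleotide count printed at the end of a row of the listing."""
    return previous_end + 3 * codons_in_row


if __name__ == "__main__":
    print(" ".join(translate("acc tac gag cct ccc cga ggt")))
    print(row_end_position(6624, 7))
```

Run on the final seven codons of the listing, `translate` yields the six residues Val Ser Ser Leu Gly Leu and stops at `taa`, and `row_end_position(6624, 7)` reproduces the closing count of 6645.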



<210> 3

<211> 2214

<212> PRT

<213> Phaffia rhodozyma



<400> 3 

Met Val Val Asp His Glu Ser Val Arg His Phe Ile Gly Gly Asn Ala
1 5 10 15

Leu Glu Asn Ala Pro Pro Ser Ser Val Thr Asp Phe Val Arg Ser Gln
20 25 30

Asp Gly His Thr Val Ile Thr Lys Val Leu Ile Ala Asn Asn Gly Ile
35 40 45

Ala Ala Val Lys Glu Ile Arg Ser Val Arg Lys Trp Ala Tyr Glu Thr
50 55 60

Phe Gly Asp Glu Arg Ala Ile Glu Phe Thr Val Met Ala Thr Pro Glu
65 70 75 80

Asp Leu Lys Val Asn Cys Asp Tyr Ile Arg Met Ala Asp Arg Val Val
85 90 95

Glu Val Pro Gly Gly Thr Asn Asn Asn Asn His Ser Asn Val Asp Leu
100 105 110

Ile Val Asp Ile Ala Glu Arg Phe Asn Ile His Ala Val Trp Ala Gly
115 120 125

Trp Gly His Ala Ser Glu Asn Pro Arg Leu Pro Glu Ser Leu Ala Ala
130 135 140

Ser Lys Asn Lys Ile Val Phe Ile Gly Pro Pro Gly Ser Ala Met Arg
145 150 155 160

Ser Leu Gly Asp Lys Ile Ser Ser Thr Ile Val Ala Gln Ser Ala Gln
165 170 175






Val Pro Cys Met Ala Trp Ser Gly Ser Gly Ile Thr Asp Thr Glu Leu
180 185 190

Ser Pro Gln Gly Phe Val Thr Val Pro Asp Gly Pro Tyr Gln Ala Ala
195 200 205

Cys Val Lys Thr Val Glu Asp Gly Leu Val Arg Ala Glu Lys Ile Gly
210 215 220

Leu Pro Val Met Ile Lys Ala Ser Glu Gly Gly Gly Gly Lys Gly Ile
225 230 235 240

Arg Met Val His Ser Met Asp Thr Phe Lys Asn Ser Tyr Asn Ser Val
245 250 255

Ala Ser Glu Val Pro Gly Ser Pro Ile Phe Ile Met Ala Leu Ala Gly
260 265 270

Ser Ala Arg His Leu Glu Val Gln Leu Leu Ala Asp Gln Tyr Gly Asn
275 280 285

Ala Ile Ser Leu Phe Gly Arg Asp Cys Ser Val Gln Arg Arg His Gln
290 295 300

Lys Ile Ile Glu Glu Ala Pro Val Thr Ile Ala Arg Pro Glu Arg Phe
305 310 315 320

Glu Glu Met Glu Lys Ala Ala Val Arg Leu Ala Lys Leu Val Gly Tyr
325 330 335

Val Ser Ala Gly Thr Val Glu Tyr Leu Tyr Ser His Ala Asp Asp Ser
340 345 350

Phe Phe Phe Leu Glu Leu Asn Pro Arg Leu Gln Val Glu His Pro Thr
355 360 365

Thr Glu Met Val Ser Gly Val Asn Leu Pro Ala Ala Gln Leu Gln Ile
370 375 380

Ala Met Gly Ile Pro Leu Ser Arg Ile Arg Asp Ile Arg Val Leu Tyr
385 390 395 400






Gly Leu Asp Pro His Thr Val Ser Glu Ile Asp Phe Asp Ser Ser Arg
405 410 415

Ala Glu Ser Val Gln Thr Gln Arg Lys Pro Arg Pro Lys Gly His Val
420 425 430

Ile Ala Cys Arg Ile Thr Ser Glu Asn Pro Asp Glu Gly Phe Lys Pro
435 440 445

Ser Ala Gly Asp Ile Gln Glu Leu Asn Phe Arg Ser Asn Thr Asn Val
450 455 460

Trp Gly Tyr Phe Ser Val Gly Ala Thr Gly Gly Ile His Ser Phe Ala
465 470 475 480

Asp Ser Gln Phe Gly His Val Phe Ala Tyr Gly Ser Asp Arg Thr Thr
485 490 495

Ala Arg Lys Asn Met Val Ile Ala Leu Lys Glu Leu Ser Ile Arg Gly
500 505 510

Asp Phe Arg Thr Thr Val Glu Tyr Leu Ile Thr Leu Leu Glu Thr Ser
515 520 525

Asp Phe Glu am Asn Ala Ile Thr Thr Ala Trp Leu Asp Gly Leu Ile
530 535 540

Thr Asn Lys Leu Thr Ser Glu Arg Pro Asp Pro Ser Leu Ala Val Ile
545 550 555 560

Cys Gly Ala Ile Val Lys Ala His Val Ala Ser Glu Asn Cys Trp Ala
565 570 575

Glu Tyr Arg Arg Val Leu Asp Lys Gly Gln Val Pro Ser Lys Asp Thr
580 585 590

Leu Lys Thr Val Phe Thr Leu Asp Phe Ile Tyr Glu Gly Val Arg Tyr
595 600 605

Asn Phe Thr Ala Ala Arg Ala Ser Leu Asn Thr Tyr Arg Leu Tyr Leu
610 615 620

Asn Gly Gly Lys Thr Val Val Ser Ile Arg Pro Leu Ala Asp Gly Gly
625 630 635 640

Met Leu Val Leu Leu Asp Gly Arg Ser His Thr Leu Tyr Trp Arg Glu
645 650 655

Glu Val Gly Thr Leu Arg Ile Gln Val Asp Ala Lys Thr Cys Leu Ile
660 665 670

Glu Gln Glu Asn Asp Pro Thr Gln Leu Arg Ser Pro Ser Pro Gly Lys
675 680 685

Ile Ile Arg Phe Leu Val Glu Ser Gly Asp His Ile Ser Ser Gly Asp
690 695 700

Ile Tyr Ala Glu Val Glu Val Met Lys Met Ile Leu Pro Leu Ile Ala
705 710 715 720

Gln Glu Ser Gly His Val Gln Phe Val Lys Gln Ala Gly Val Thr Val
725 730 735

Asp Pro Gly Ala Ile Ile Gly Ile Leu Ser Leu Asp Asp Pro Thr Arg
740 745 750

Val Lys Lys Ala Lys Pro Phe Glu Gly Leu Leu Pro Val Thr Gly Leu
755 760 765

Pro Asn Leu Pro Gly Asn Arg Pro His Gln Arg Leu Gln Phe Gln Leu
770 775 780

Glu Ser Ile Tyr Ser Val Leu Asp Gly Tyr Glu Ser Asp Ser Thr Ala
785 790 795 800

Thr Ile Leu Arg Ser Phe Ser Glu Asn Leu Tyr Asp Pro Asp Leu Ala
805 810 815

Phe Gly Glu Ala Leu Ser Ile Ile Ser Val Leu Ser Gly Arg Met Pro
820 825 830

Ala Asp Leu Glu Glu Ser Ile Arg Glu Val Ile Ser Glu Ala Gln Ser
835 840 845




Lys Pro His Ala Glu Phe Pro Gly Ser Lys Ile Leu Lys Val Val Glu
850 855 860

Arg Tyr Ile Asp Asn Leu Arg Pro Gln Glu Arg Ala Met Val Arg Thr
865 870 875 880

Gln Ile Glu Pro Ile Val Gly Ile Ala Glu Lys Asn Val Gly Gly Pro
885 890 895

Lys Gly Tyr Ala Ser Tyr Val Leu Ala Thr Ile Leu Gln Lys Phe Leu
900 905 910

Ala Val Glu Ala Val Phe Ala Thr Gly Ser Glu Glu Ala Ile Val Leu
915 920 925

Gln Leu Arg Asp Glu Asn Arg Glu Ser Leu Asn Asp Val Leu Gly Leu
930 935 940

Val Leu Ala His Ser Arg Leu Ser Ala Arg Ser Lys Leu Val Leu Ser
945 950 955 960

Val Phe Asp Leu Ile Lys Ser Met Gln Leu Leu Asn Asn Thr Glu Gly
965 970 975

Ser Phe Leu His Lys Thr Met Lys Ala Leu Ala Asp Met Pro Thr Lys
980 985 990

Ala Pro Leu Ala Ser Lys Val Ser Leu Lys Ala Arg Glu Ile Leu Ile
995 1000 1005

Ser Cys Ser Leu Pro Ser Tyr Glu Glu Arg Leu Phe Gln Met Glu
1010 1015 1020

Lys Ile Leu Asn Ser Ser Val Thr Thr Ser Tyr Tyr Gly Glu Thr
1025 1030 1035

Gly Gly Gly His Arg Asn Pro Ser Val Asp Val Leu Thr Glu Ile
1040 1045 1050

Ser Asn Ser Arg Phe Thr Val Tyr Asp Val Leu Ser Ser Phe Phe
1055 1060 1065




Lys His Asp Asp Pro Trp Ile Val Leu Ala Ser Leu Thr Val Tyr
1070 1075 1080

Val Leu Arg Ala Tyr Arg Glu Tyr Ser Ile Leu Asp Met Gln His
1085 1090 1095

Glu Gln Gly Gln Asp Gly Ala Ala Gly Val Ile Thr Trp Arg Phe
1100 1105 1110

Lys Leu Asn Gln Pro Ile Ala Glu Ser Ser Thr Pro Arg Val Asp
1115 1120 1125

Ser Asn Arg Asp Val Tyr Arg Val Gly Ser Leu Ser Asp Leu Thr
1130 1135 1140

Tyr Lys Ile Lys Gln Ser Gln Thr Glu Pro Leu Arg Ala Gly Val
1145 1150 1155

Met Thr Ser Phe Asn Asn Leu Lys Glu Val Gln Asp Gly Leu Leu
1160 1165 1170

Asn Val Leu Ser Phe Phe Pro Ala Tyr His His Gln Asp Phe Thr
1175 1180 1185

Gln Arg His Gly Gln Asp Ser Ala Met Pro Asn Val Leu Asn Ile
1190 1195 1200

Ala Ile Arg Ala Phe Glu Glu Lys Asp Asp Met Ser Asp Leu Asp
1205 1210 1215

Trp Ala Lys Ser Val Glu Ser Leu Val Met Gln Met Ser Ala Glu
1220 1225 1230

Ile Gln Lys Lys Gly Ile Arg Arg Val Thr Phe Leu Val Cys Arg
1235 1240 1245

Lys Gly Val Tyr Pro Ser Tyr Phe Thr Phe Arg Gln Glu Gly Ala
1250 1255 1260

Gln Gly Pro Trp Arg Glu Glu Glu Lys Ile Arg Asn Ile Glu Pro
1265 1270 1275







Ala Leu Ala Ser Gln Leu Glu Leu Asn Arg Leu Ser Asn Phe Lys
1280 1285 1290

Val Thr Pro Ile Phe Val Asp Asn Arg Gln Ile His Ile Tyr Lys
1295 1300 1305

Gly Val Gly Lys Glu Asn Ser Ser Asp Val Arg Phe Phe Ile Arg
1310 1315 1320

Ala Leu Val Arg Pro Gly Arg Val Gln Gly Ser Met Lys Ala Ala
1325 1330 1335

Glu Tyr Leu Ile Ser Glu Cys Asp Arg Leu Leu Thr Asp Ile Leu
1340 1345 1350

Asp Ala Leu Glu Val Val Gly Ala Glu Thr Arg Asn Ala Asp Cys
1355 1360 1365

Asn His Val Gly Ile Asn Phe Ile Tyr Asn Val Leu Val Asp Phe
1370 1375 1380

Asp Asp Val Gln Glu Ala Leu Ala Gly Phe Ile Glu Arg His Gly
1385 1390 1395

Lys Arg Leu Trp Arg Leu Arg Val Thr Ala Ser Glu Ile Arg Met
1400 1405 1410

Val Leu Glu Asp Asp Glu Gly Asn Val Thr Pro Ile Arg Cys Cys
1415 1420 1425

Ile Glu Asn Val Ser Gly Phe Val Val Lys Tyr His Ala Tyr Gln
1430 1435 1440

Glu Val Glu Thr Glu Lys Gly Thr Thr Ile Leu Lys Ser Ile Gly
1445 1450 1455

Asp Leu Gly Pro Leu His Leu Gln Pro Val Asn His Ala Tyr Gln
1460 1465 1470

Thr Lys Asn Ser Leu Gln Pro Arg Arg Tyr Gln Ala His Leu Val
1475 1480 1485

Gly Thr Thr Tyr Val Tyr Asp Tyr Pro Asp Leu Phe Val Gln Ser
1490 1495 1500

Leu Arg Lys Val Trp Ala Glu Ala Ala Ala Lys Ile Pro His Leu
1505 1510 1515

Arg Val Pro Ser Glu Pro Leu Thr Ala Thr Glu Leu Val Leu Asp
1520 1525 1530

Glu Asn Asn Glu Leu Gln Glu Val Glu Arg Pro Pro Gly Ser Asn
1535 1540 1545

Ser Cys Gly Met Val Ala Trp Ile Phe Thr Met Leu Thr Pro Glu
1550 1555 1560

Tyr Pro Lys Gly Arg Arg Val Val Ala Ile Ala Asn Asp Ile Thr
1565 1570 1575

Phe Lys Ile Gly Ser Phe Gly Pro Lys Glu Asp Asp Tyr Phe Phe
1580 1585 1590

Lys Ala Thr Glu Ile Ala Lys Lys Leu Gly Leu Pro Arg Ile Tyr
1595 1600 1605

Leu Ser Ala Asn Ser Gly Ala Arg Leu Gly Ile Ala Glu Glu Leu
1610 1615 1620

Leu His Ile Phe Lys Ala Ala Phe Val Asp Pro Ala Lys Pro Ser
1625 1630 1635

Met Gly Ile Lys Tyr Leu Tyr Leu Thr Pro Glu Thr Leu Ser Thr
1640 1645 1650

Leu Ala Lys Lys Gly Ser Ser Val Thr Thr Glu Glu Ile Glu Asp
1655 1660 1665

Asp Gly Glu Arg Arg His Lys Ile Thr Ala Ile Ile Gly Leu Ala
1670 1675 1680

Glu Gly Leu Gly Val Glu Ser Leu Arg Gly Ser Gly Leu Ile Ala
1685 1690 1695

Gly Ala Thr Thr Arg Ala Tyr Glu Glu Gly Ile Phe Thr Ile Ser
1700 1705 1710

Leu Val Thr Ala Arg Ser Val Gly Ile Gly Ala Tyr Leu Val Arg
1715 1720 1725

Leu Gly Gln Arg Ala Ile Gln Val Glu Gly Asn Pro Met Ile Leu
1730 1735 1740

Thr Gly Ala Gln Ser Leu Asn Lys Val Leu Gly Arg Gln Val Tyr
1745 1750 1755

Thr Ser Asn Leu Gln Leu Gly Gly Thr Gln Ile Met Ala Arg Asn
1760 1765 1770

Gly Thr Thr His Leu Val Ala Glu Ser Asp Leu Asp Gly Ala Leu
1775 1780 1785

Lys Val Ile Gln Trp Leu Ser Tyr Val Pro Glu Arg Lys Gly Lys
1790 1795 1800

Ala Ile Pro Ile Trp Pro Ser Glu Asp Pro Trp Asp Arg Thr Val
1805 1810 1815

Thr Tyr Glu Pro Pro Arg Gly Pro Tyr Asp Pro Arg Trp Leu Leu
1820 1825 1830

Glu Gly Lys Pro Asp Glu Gly Leu Thr Gly Leu Phe Asp Lys Gly
1835 1840 1845

Ser Phe Met Glu Thr Leu Gly Asp Trp Ala Lys Thr Ile Val Thr
1850 1855 1860

Gly Arg Ala Arg Leu Gly Gly Ile Pro Met Gly Val Ile Ala Val
1865 1870 1875

Glu Thr Arg Thr Thr Glu Lys Ile Ile Ala Ala Asp Pro Ala Asn
1880 1885 1890

Pro Ala Ala Phe Glu Gln Lys Ile Met Glu Ala Gly Gln Val Trp
1895 1900 1905

Asn Pro Asn Ala Ala Tyr Lys Thr Ala Gln Ser Ile Phe Asp Ile
1910 1915 1920

Asn Lys Glu Gly Leu Pro Leu Met Ile Leu Ala Asn Ile Arg Gly
1925 1930 1935

Phe Ser Gly Gly Gln Gly Asp Met Phe Asp Ala Ile Leu Lys Gln
1940 1945 1950

Gly Ser Lys Ile Val Asp Gly Leu Ser Asn Phe Lys Gln Pro Val
1955 1960 1965

Phe Val Tyr Val Val Pro Asn Gly Glu Leu Arg Gly Gly Ala Trp
1970 1975 1980

Val Val Leu Asp Pro Thr Ile Asn Leu Ala Lys Met Glu Met Tyr
1985 1990 1995

Ala Asp Glu Thr Ala Arg Gly Gly Ile Leu Glu Pro Glu Gly Ile
2000 2005 2010

Val Glu Ile Lys Phe Arg Arg Asp Lys Val Ile Ala Thr Met Glu
2015 2020 2025

Arg Leu Asp Glu Thr Tyr Ala Ser Leu Lys Ala Ala Ser Asn Asp
2030 2035 2040

Ser Thr Lys Ser Ala Glu Glu Arg Ala Lys Ser Ala Glu Leu Leu
2045 2050 2055

Lys Ala Arg Glu Thr Leu Leu Gln Pro Thr Tyr Leu Gln Ile Ala
2060 2065 2070

His Leu Tyr Ala Asp Leu His Asp Arg Val Gly Arg Met Glu Ala
2075 2080 2085

Lys Gly Cys Ala Lys Arg Ala Val Trp Ala Glu Ala Arg Arg Phe
2090 2095 2100

Phe Tyr Trp Arg Leu Arg Arg Arg Leu Asn Asp Glu His Ile Leu
2105 2110 2115

Ser Lys Phe Ala Ala Ala Asn Pro Asp Leu Thr Leu Glu Glu Arg
2120 2125 2130

Gln Asn Ile Leu Asp Ser Val Val Gln Thr Asp Leu Thr Asp Asp
2135 2140 2145

Arg Ala Thr Ala Glu Trp Ile Glu Gln Ser Ala Glu Glu Ile Ala
2150 2155 2160

Ala Ala Val Ala Glu Val Arg Ser Thr Tyr Val Ser Asn Lys Ile
2165 2170 2175

Ile Ser Phe Ala Glu Thr Glu Arg Ala Gly Ala Leu Gln Gly Leu
2180 2185 2190

Val Ala Val Leu Ser Thr Leu Asn Ala Glu Asp Lys Lys Ala Leu
2195 2200 2205

Val Ser Ser Leu Gly Leu
2210

<222> (21)..(21)

<223> n is a, c, g or t

<220>

<221> misc_feature

<222> (24)..(24)

<223> n is a, c, g or t

<400> 4

athggxxgent ayytngynmg nycngg 26
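
The degenerate primers in this listing are written with IUPAC nucleotide ambiguity codes; as the feature entries state, n stands for a, c, g or t. The sketch below is illustrative only: it assumes the standard IUPAC code table, and the six-base fragment `athggn` is a hypothetical example chosen for brevity, not one of the listed primers. It counts how many distinct oligonucleotides a degenerate sequence represents and enumerates them.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (lower case, as in the listing).
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at", "k": "gt", "m": "ac",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg", "n": "acgt",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    total = 1
    for base in primer.lower():
        total *= len(IUPAC[base])
    return total


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.lower())
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    fragment = "athggn"  # hypothetical fragment, for illustration only
    print(degeneracy(fragment))
    print(expand(fragment)[:3])
```

Because the count grows multiplicatively with each ambiguous position, `degeneracy` is the cheaper check for a full-length primer; `expand` is only practical for short fragments.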
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DNA 
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<400> 5 

acnacnaccc angcnccncc nckna 25



<210> 6

<211> 26

<212> DNA

<213> Artificial

<400> 6

ttaccctcgt cgtcctcaag aaccat 26

<210> 7

<211> 26

<212> DNA

<213> Artificial

<400> 7

tggatcctac tatcaacctg ccaaga 26

<210> 8

<211> 26

<212> DNA

<213> Artificial

<400> 8

gtgaacactg tcttgagagt gtcctt 26

<210> 9

<211> 20

<212> DNA

<213> Artificial

<400> 9

ccgctgctca gcttcagatt 20



<210> 10 

<211> 19 

<212> DNA 

<213> Artificial 

<400> 10

gattagatag ggatctagt
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